PCT/US97/14593



LONG WAVELENGTH ENGINEERED FLUORESCENT PROTEINS 

BACKGROUND OF THE INVENTION 

This application claims the benefit of the earlier filing date of United States
provisional patent application serial number 60/024,050, filed on August 16, 1996, entitled
"Long Wavelength Mutant Fluorescent Proteins," and patent application serial number
08/706,408, filed on August 30, 1996, entitled "Long Wavelength Engineered Fluorescent
Proteins," both of which are herein incorporated by reference.

This invention was made in part with Government support under grant no. 
MCB 9418479 awarded by the National Science Foundation. The Government may have 
rights in this invention. 

Fluorescent molecules are attractive as reporter molecules in many assay 
systems because of their high sensitivity and ease of quantification. Recently, fluorescent 
proteins have been the focus of much attention because they can be produced in vivo by 
biological systems, and can be used to trace intracellular events without the need to be 
introduced into the cell through microinjection or permeabilization. The green fluorescent 
protein of Aequorea victoria is particularly interesting as a fluorescent protein. A cDNA for 
the protein has been cloned. (D.C. Prasher et al., "Primary structure of the Aequorea
victoria green-fluorescent protein," Gene (1992) 111:229-33.) Not only can the primary
amino acid sequence of the protein be expressed from the cDNA, but the expressed protein 
can fluoresce. This indicates that the protein can undergo the cyclization and oxidation 
believed to be necessary for fluorescence. Aequorea green fluorescent protein
("GFP") is a stable, proteolysis-resistant single chain of 238 residues and has two absorption
maxima at around 395 and 475 nm. The relative amplitudes of these two peaks are sensitive
to environmental factors (W. W. Ward. Bioluminescence and Chemiluminescence (M. A.
DeLuca and W. D. McElroy, eds) Academic Press pp. 235-242 (1981); W. W. Ward & S.
H. Bokman Biochemistry 21:4535-4540 (1982); W. W. Ward et al. Photochem. Photobiol.
35:803-808 (1982)) and illumination history (A. B. Cubitt et al. Trends Biochem. Sci.
20:448-455 (1995)), presumably reflecting two or more ground states. Excitation at the
primary absorption peak of 395 nm yields an emission maximum at 508 nm with a quantum
yield of 0.72-0.85 (O. Shimomura and F. H. Johnson J. Cell. Comp. Physiol. 59:223 (1962);
J. G. Morin and J. W. Hastings, J. Cell. Physiol. 77:313 (1971); H. Morise et al.
Biochemistry 13:2656 (1974); W. W. Ward Photochem. Photobiol. Reviews (Smith, K. C.
ed.) 4:1 (1979); A. B. Cubitt et al. Trends Biochem. Sci. 20:448-455 (1995); D. C. Prasher
Trends Genet. 11:320-323 (1995); M. Chalfie Photochem. Photobiol. 62:651-656 (1995);
W. W. Ward. Bioluminescence and Chemiluminescence (M. A. DeLuca and W. D.
McElroy, eds) Academic Press pp. 235-242 (1981); W. W. Ward & S. H. Bokman
Biochemistry 21:4535-4540 (1982); W. W. Ward et al. Photochem. Photobiol. 35:803-808
(1982)). The fluorophore results from the autocatalytic cyclization of the polypeptide
backbone between residues Ser 65 and Gly 67 and oxidation of the α-β bond of Tyr 66 (A. B.
Cubitt et al. Trends Biochem. Sci. 20:448-455 (1995); C. W. Cody et al. Biochemistry
32:1212-1218 (1993); R. Heim et al. Proc. Natl. Acad. Sci. USA 91:12501-12504 (1994)).
Mutation of Ser 65 to Thr (S65T) simplifies the excitation spectrum to a single peak at 488
nm of enhanced amplitude (R. Heim et al. Nature 373:664-665 (1995)), which no longer
gives signs of conformational isomers (A. B. Cubitt et al. Trends Biochem. Sci. 20:448-455
(1995)).

Fluorescent proteins have been used as markers of gene expression, tracers of 
cell lineage and as fusion tags to monitor protein localization within living cells. (M.
Chalfie et al., "Green fluorescent protein as a marker for gene expression," Science 263:802-805;
A.B. Cubitt et al., "Understanding, improving and using green fluorescent proteins,"
TIBS 20, November 1995, pp. 448-455; U.S. patent 5,491,084, M. Chalfie and D. Prasher.)
Furthermore, engineered versions of Aequorea green fluorescent protein have been
identified that exhibit altered fluorescence characteristics, including altered excitation and
emission maxima, as well as excitation and emission spectra of different shapes. (R. Heim
et al., "Wavelength mutations and posttranslational autoxidation of green fluorescent
protein," Proc. Natl. Acad. Sci. USA, (1994) 91:12501-04; R. Heim et al., "Improved green
fluorescence," Nature (1995) 373:663-665.) These properties add variety and utility to the
arsenal of biologically based fluorescent indicators. 

There is a need for engineered fluorescent proteins with varied fluorescent
properties.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figs. 1A-1B. (A) Schematic drawing of the backbone of GFP produced by
Molscript (J.P. Kraulis, J. Appl. Cryst., 24:946 (1991)). The chromophore is shown as a
ball and stick model. (B) Schematic drawing of the overall fold of GFP. Approximate 
residue numbers mark the beginning and ending of the secondary structure elements. 

Figs. 2A-2C. (A) Stereo drawing of the chromophore and residues in the
immediate vicinity. Carbon atoms are drawn as open circles, oxygen is filled and nitrogen is
shaded. Solvent molecules are shown as isolated filled circles. (B) Portion of the final 2Fo-Fc
electron density map contoured at 1.0 σ showing the electron density surrounding the
chromophore. (C) Schematic diagram showing the first and second spheres of coordination
of the chromophore. Hydrogen bonds are shown as dashed lines and have the indicated
lengths in Å. Inset: proposed structure of the carbinolamine intermediate that is presumably
formed during generation of the chromophore.

Fig. 3 depicts the nucleotide sequence (SEQ ID NO:1) and deduced amino
acid sequence (SEQ ID NO:2) of an Aequorea green fluorescent protein. 

Fig. 4 depicts the nucleotide sequence (SEQ ID NO:3) and deduced amino 
acid sequence (SEQ ID NO:4) of the engineered Aequorea-related fluorescent protein
S65G/S72A/T203Y utilizing preferred mammalian codons and optimal Kozak sequence. 

Figs. 5-1 to 5-28 present the coordinates for the crystal structure of 
Aequorea-related green fluorescent protein S65T.

Fig. 6 shows the fluorescence excitation and emission spectra for engineered 
fluorescent proteins 20A and 10C (Table F). The vertical line at 528 nm compares the
emission maxima of 10C, to the left of the line, and 20A, to the right of the line. 

SUMMARY OF THE INVENTION 

This invention provides functional engineered fluorescent proteins with 
varied fluorescence characteristics that can be easily distinguished from currently existing
green and blue fluorescent proteins. Such engineered fluorescent proteins enable the
simultaneous measurement of two or more processes within cells and can be used as
fluorescence energy donors or acceptors when used to monitor protein-protein interactions
through FRET. Longer wavelength engineered fluorescent proteins are particularly useful
because photodynamic toxicity and auto-fluorescence of cells are significantly reduced at
longer wavelengths. In particular, the introduction of the substitution T203X, wherein X is
an aromatic amino acid, results in an increase in the excitation and emission wavelength
maxima of Aequorea-related fluorescent proteins.

In one aspect, this invention provides a nucleic acid molecule comprising a 
nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid 
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent 
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least an amino acid 
substitution located no more than about 0.5 nm from the chromophore of the engineered 
fluorescent protein, wherein the substitution alters the electronic environment of the 
chromophore, whereby the functional engineered fluorescent protein has a different 
fluorescent property than Aequorea green fluorescent protein. 

In one aspect this invention provides a nucleic acid molecule comprising a 
nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid 
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent 
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least a substitution at 
T203 and, in particular, T203X, wherein X is an aromatic amino acid selected from H, Y, W 
or F, said functional engineered fluorescent protein having a different fluorescent property 
than Aequorea green fluorescent protein. In one embodiment, the amino acid sequence 
further comprises a substitution at S65, wherein the substitution is selected from S65G, 
S65T, S65A, S65L, S65C, S65V and S65I. In another embodiment, the amino acid 
sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y;
S72A/F64L/S65G/T203Y; S65G/V68L/Q69K/S72A/T203Y; S72A/S65G/V68L/T203Y;
S65G/S72A/T203Y; or S65G/S72A/T203W. In another embodiment, the amino acid
sequence further comprises a substitution at Y66, wherein the substitution is selected from 
Y66H, Y66F, and Y66W. In another embodiment, the amino acid sequence further 
comprises a mutation from Table A. In another embodiment, the amino acid sequence 
further comprises a folding mutation. In another embodiment, the nucleotide sequence 
encoding the protein differs from the nucleotide sequence of SEQ ID NO:1 by the
substitution of at least one codon by a preferred mammalian codon. In another embodiment, 
the nucleic acid molecule encodes a fusion protein wherein the fusion protein comprises a 
polypeptide of interest and the functional engineered fluorescent protein. 
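
The slash-separated substitution notation used above (for example, S65G/S72A/T203Y) can be read mechanically: each token names the original residue, its position in the numbering of SEQ ID NO:2, and the replacement residue. The following Python sketch illustrates that reading only. The helper names parse_substitutions and apply_substitutions and the ten-residue toy sequence are assumptions introduced for illustration; the toy sequence is not SEQ ID NO:2.

```python
import re

# One substitution token: original residue, 1-based position, new residue.
_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(spec):
    """Split a spec such as 'S65G/S72A/T203Y' into (original, position, new) tuples."""
    substitutions = []
    for token in spec.split("/"):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        original, position, new = match.groups()
        substitutions.append((original, int(position), new))
    return substitutions


def apply_substitutions(sequence, spec):
    """Return a copy of `sequence` with every substitution in `spec` applied."""
    residues = list(sequence)
    for original, position, new in parse_substitutions(spec):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy sequence (NOT SEQ ID NO:2), used only to exercise the helpers.
toy_sequence = "MSKGEELFTG"
print(parse_substitutions("S65G/S72A/T203Y"))
print(apply_substitutions(toy_sequence, "S2A/T9Y"))  # MAKGEELFYG
```

Checking the original residue before replacing it guards against numbering that has drifted from the reference sequence, for example when a residue has been inserted near the N-terminus.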

In another aspect, this invention provides a nucleic acid molecule comprising 
a nucleotide sequence encoding a functional engineered fluorescent protein whose amino 
acid sequence is substantially identical to the amino acid sequence of Aequorea green
fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least an
amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165,
I167, Q183, N185, L220, E222 (not E222G), or V224, said functional engineered
fluorescent protein having a different fluorescent property than Aequorea green fluorescent
protein. In one embodiment, the amino acid substitution is:

L42X, wherein X is selected from C, F, H, W and Y, 

V61X, wherein X is selected from F, Y, H and C, 

T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C, 

V68X, wherein X is selected from F, Y and H, 

Q69X, wherein X is selected from K, R, E and G, 

Q94X, wherein X is selected from D, E, H, K and N, 

N121X, wherein X is selected from F, H, W and Y, 

Y145X, wherein X is selected from W, C, F, L, E, H, K and Q, 

H148X, wherein X is selected from F, Y, N, K, Q and R, 

V150X, wherein X is selected from F, Y and H,

F165X, wherein X is selected from H, Q, W and Y, 

I167X, wherein X is selected from F, Y and H, 

Q183X, wherein X is selected from H, Y, E and K,

N185X, wherein X is selected from D, E, H, K and Q,

L220X, wherein X is selected from H, N, Q and T, 

E222X, wherein X is selected from N and Q, or 

V224X, wherein X is selected from H, N, Q, T, F, W and Y. 

In a further aspect, this invention provides an expression vector comprising 
expression control sequences operatively linked to any of the aforementioned nucleic acid 
molecules. In a further aspect, this invention provides a recombinant host cell comprising 
the aforementioned expression vector. 

In another aspect, this invention provides a functional engineered fluorescent 
protein whose amino acid sequence is substantially identical to the amino acid sequence of 
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 
by at least an amino acid substitution located no more than about 0.5 nm from the 
chromophore of the engineered fluorescent protein, wherein the substitution alters the
electronic environment of the chromophore, whereby the functional engineered fluorescent
protein has a different fluorescent property than Aequorea green fluorescent protein. 

In another aspect, this invention provides a functional engineered fluorescent 
protein whose amino acid sequence is substantially identical to the amino acid sequence of 
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2
by at least the amino acid substitution at T203, and in particular, T203X, wherein X is an
aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent
protein having a different fluorescent property than Aequorea green fluorescent protein. In
one embodiment, the amino acid sequence further comprises a substitution at S65, wherein
the substitution is selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I. In
another embodiment, the amino acid sequence differs by no more than the substitutions
S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y; S72A/S65G/V68I/T203Y;
S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or S65G/S72A/T203W. In another
embodiment, the amino acid sequence further comprises a substitution at Y66, wherein the
substitution is selected from Y66H, Y66F, and Y66W. In another embodiment, the amino
acid sequence further comprises a folding mutation. In another embodiment, the engineered 
fluorescent protein is part of a fusion protein wherein the fusion protein comprises a 
polypeptide of interest and the functional engineered fluorescent protein. 

In another aspect this invention provides a functional engineered fluorescent
protein whose amino acid sequence is substantially identical to the amino acid sequence of
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2
by at least an amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145,
H148, V150, F165, I167, Q183, N185, L220, E222, or V224, said functional engineered
fluorescent protein having a different fluorescent property than Aequorea green fluorescent
protein.

In another aspect, this invention provides a fluorescently labelled antibody 
comprising an antibody coupled to any of the aforementioned functional engineered 
fluorescent proteins. In one embodiment, the fluorescently labelled antibody is a fusion
protein wherein the fusion protein comprises the antibody fused to the functional engineered
fluorescent protein.

In another aspect, this invention provides a nucleic acid molecule comprising 
a nucleotide sequence encoding an antibody fused to a nucleotide sequence encoding a
functional engineered fluorescent protein of this invention.

In another aspect, this invention provides a fluorescently labelled nucleic
acid probe comprising a nucleic acid probe coupled to a functional engineered fluorescent
protein of this invention. The fusion can be through a linker peptide.

In another aspect, this invention provides a method for determining whether 
a mixture contains a target comprising contacting the mixture with a fluorescently labelled 
probe comprising a probe and a functional engineered fluorescent protein of this invention; 
and determining whether the target has bound to the probe. In one embodiment, the target
molecule is captured on a solid matrix.

In another aspect, this invention provides a method for engineering a 
functional engineered fluorescent protein having a fluorescent property different than 
Aequorea green fluorescent protein, comprising substituting an amino acid that is located no
more than 0.5 nm from any atom in the chromophore of an Aequorea-related green
fluorescent protein with another amino acid; whereby the substitution alters a fluorescent
property of the protein. In one embodiment, the amino acid substitution alters the electronic
environment of the chromophore.
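
The 0.5 nm criterion above can be applied directly to a set of atomic coordinates. The following Python sketch is a minimal illustration, assuming coordinates expressed in angstroms (so that 0.5 nm corresponds to 5.0 angstroms) and a simple list of (residue number, xyz) pairs. The function name residues_near_chromophore, the input layout, and the toy coordinates are assumptions made for illustration and are not taken from Figs. 5-1 to 5-28.

```python
import math

CUTOFF_NM = 0.5  # distance criterion stated in the text


def residues_near_chromophore(atoms, chromophore_residues, cutoff_nm=CUTOFF_NM):
    """Return residue numbers having any atom within the cutoff of any chromophore atom.

    atoms: list of (residue_number, (x, y, z)) pairs, coordinates in angstroms.
    chromophore_residues: set of residue numbers that make up the chromophore.
    """
    cutoff_angstrom = cutoff_nm * 10.0  # 1 nm = 10 angstroms
    chromophore_atoms = [xyz for res, xyz in atoms if res in chromophore_residues]
    near = set()
    for res, xyz in atoms:
        if res in chromophore_residues:
            continue  # the chromophore itself is not a candidate for substitution here
        if any(math.dist(xyz, c) <= cutoff_angstrom for c in chromophore_atoms):
            near.add(res)
    return sorted(near)


# Hypothetical toy coordinates, invented only to exercise the function.
toy_atoms = [
    (66, (0.0, 0.0, 0.0)),    # a chromophore atom
    (203, (3.0, 0.0, 0.0)),   # 3.0 angstroms away: inside the cutoff
    (148, (0.0, 4.9, 0.0)),   # 4.9 angstroms away: inside the cutoff
    (10, (0.0, 0.0, 12.0)),   # 12.0 angstroms away: outside the cutoff
]
print(residues_near_chromophore(toy_atoms, {65, 66, 67}))  # [148, 203]
```

The test is made per atom rather than per residue centroid because the text refers to the distance from any atom in the chromophore.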

In another aspect, this invention provides a method for engineering a 
functional engineered fluorescent protein having a different fluorescent property than
Aequorea green fluorescent protein comprising substituting amino acids in a loop domain of
an Aequorea-related green fluorescent protein with amino acids so as to create a consensus
sequence for phosphorylation or for proteolysis.

In another aspect, this invention provides a method for producing 
fluorescence resonance energy transfer comprising providing a donor molecule comprising
a functional engineered fluorescent protein of this invention; providing an appropriate acceptor
molecule for the fluorescent protein; and bringing the donor molecule and the acceptor
molecule into sufficiently close contact to allow fluorescence resonance energy transfer.

In another aspect, this invention provides a method for producing 
fluorescence resonance energy transfer comprising providing an acceptor molecule
comprising a functional engineered fluorescent protein of this invention; providing an
appropriate donor molecule for the fluorescent protein; and bringing the donor molecule and
the acceptor molecule into sufficiently close contact to allow fluorescence resonance energy
transfer. In one embodiment, the donor molecule is an engineered fluorescent protein whose
amino acid sequence comprises the substitution T203I and the acceptor molecule is an
engineered fluorescent protein whose amino acid sequence comprises the substitution
T203X, wherein X is an aromatic amino acid selected from H, Y, W or F, said functional
engineered fluorescent protein having a different fluorescent property than Aequorea green
fluorescent protein.
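
How close "sufficiently close contact" must be is commonly estimated with the Förster relation, E = 1/(1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The following Python sketch evaluates that relation. The function name fret_efficiency and the R0 value of 5.0 nm are assumptions chosen for illustration; 5.0 nm is not a measured value for any protein pair described herein.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0:
        raise ValueError("distance must be non-negative")
    if forster_radius_nm <= 0:
        raise ValueError("Förster radius must be positive")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Assumed, illustrative Förster radius (not taken from this disclosure).
ASSUMED_R0_NM = 5.0

for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, ASSUMED_R0_NM):.3f}")
```

Because the efficiency falls with the sixth power of the distance, it drops from one half at r = R0 to under 2% at r = 2 R0, which is why the donor and acceptor must be brought within a few nanometers of each other.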

In another aspect, this invention provides a crystal of a protein comprising a 
fluorescent protein with an amino acid sequence substantially identical to SEQ ID NO: 2, 
wherein said crystal diffracts with at least a 2.0 to 3.0 angstrom resolution. 

In another embodiment, this invention provides a computational method of
designing a fluorescent protein comprising determining from a three dimensional model of a
crystallized fluorescent protein comprising a fluorescent protein with a bound ligand, at
least one interacting amino acid of the fluorescent protein that interacts with at least one
first chemical moiety of the ligand, and selecting at least one chemical modification of the
first chemical moiety to produce a second chemical moiety with a structure to either
decrease or increase an interaction between the interacting amino acid and the second
chemical moiety compared to the interaction between the interacting amino acid and the
first chemical moiety.

In another embodiment, this invention provides a computational method of
modeling the three dimensional structure of a fluorescent protein comprising determining a
three dimensional relationship between at least two atoms listed in the atomic coordinates of
Figs. 5-1 to 5-28.

In another embodiment, this invention provides a device comprising a 
storage device and, stored in the device, at least 10 atomic coordinates selected from the 
atomic coordinates listed in Figs. 5-1 to 5-28. In one embodiment, the storage device is a
computer readable device that stores code that receives as input the atomic coordinates. In 
another embodiment, the computer readable device is a floppy disk or a hard drive. 

DETAILED DESCRIPTION OF THE INVENTION 

I. DEFINITIONS

Unless defined otherwise, all technical and scientific terms used herein have 
the same meaning as commonly understood by those of ordinary skill in the art to which
this invention belongs. Although any methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the present invention, the preferred 
methods and materials are described. For purposes of the present invention, the following 
terms are defined below. 
"Binding pair" refers to two moieties (e.g. chemical or biochemical) that have an
affinity for one another. Examples of binding pairs include antigen/antibodies,
lectin/avidin, target polynucleotide/probe oligonucleotide, antibody/anti-antibody,
receptor/ligand, enzyme/ligand and the like. "One member of a binding pair" refers to one
moiety of the pair, such as an antigen or ligand. 

"Nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer in either
single- or double-stranded form, and, unless otherwise limited, encompasses known analogs
of natural nucleotides that can function in a similar manner as naturally occurring
nucleotides. It will be understood that when a nucleic acid molecule is represented by a
DNA sequence, this also includes RNA molecules having the corresponding RNA sequence
in which "U" replaces "T."

"Recombinant nucleic acid molecule" refers to a nucleic acid molecule which 
is not naturally occurring, and which comprises two nucleotide sequences which are not 
naturally joined together. Recombinant nucleic acid molecules are produced by artificial 
recombination, e.g., genetic engineering techniques or chemical synthesis. 

Reference to a nucleotide sequence "encoding" a polypeptide means that the
sequence, upon transcription and translation of mRNA, produces the polypeptide. This
includes both the coding strand, whose nucleotide sequence is identical to mRNA and
whose sequence is usually provided in the sequence listing, as well as its complementary
strand, which is used as the template for transcription. As any person skilled in the art
recognizes, this also includes all degenerate nucleotide sequences encoding the same amino
acid sequence. Nucleotide sequences encoding a polypeptide include sequences containing
introns.
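
The relationship among the coding strand, its complementary template strand, and the mRNA can be sketched in a few lines of Python. The function names and the nine-base toy sequence below are assumptions introduced for illustration and are not drawn from the sequence listing; the sketch handles only the unambiguous bases A, C, G and T.

```python
# Base-pairing table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_strand(coding_strand):
    """Return the complementary (template) strand, written 5' to 3'."""
    return coding_strand.upper().translate(_COMPLEMENT)[::-1]


def mrna_from_coding(coding_strand):
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def mrna_from_template(template):
    """Transcribe a template strand: take its reverse complement, then replace T with U."""
    return template_strand(template).replace("T", "U")


# Hypothetical toy coding sequence (three codons), for illustration only.
coding = "ATGAGTAAA"
template = template_strand(coding)
print(template)                      # TTTACTCAT
print(mrna_from_coding(coding))      # AUGAGUAAA
print(mrna_from_template(template))  # AUGAGUAAA
```

Both routes give the same mRNA, which is why a reference to a sequence "encoding" a polypeptide covers either strand.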

"Expression control sequences" refers to nucleotide sequences that regulate 
the expression of a nucleotide sequence to which they are operatively linked. Expression 
control sequences are "operatively linked" to a nucleotide sequence when the expression
control sequences control and regulate the transcription and, as appropriate, translation of
the nucleotide sequence. Thus, expression control sequences can include appropriate
promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in front of a
protein-encoding gene, splicing signals for introns, maintenance of the correct reading frame 
of that gene to permit proper translation of the mRNA, and stop codons. 

"Naturally-occurring" as used herein, as applied to an object, refers to the fact that an 
object can be found in nature. For example, a polypeptide or polynucleotide sequence that
is present in an organism (including viruses) that can be isolated from a source in nature and 
which has not been intentionally modified by man in the laboratory is naturally-occurring. 

"Operably linked" refers to a juxtaposition wherein the components so described are 
in a relationship permitting them to function in their intended manner. A control sequence
"operably linked" to a coding sequence is ligated in such a way that expression of the
coding sequence is achieved under conditions compatible with the control sequences, such
as when the appropriate molecules (e.g., inducers and polymerases) are bound to the control
or regulatory sequence(s).

"Control sequence" refers to polynucleotide sequences which are necessary to effect
the expression of coding and non-coding sequences to which they are ligated. The nature of
such control sequences differs depending upon the host organism; in prokaryotes, such
control sequences generally include promoter, ribosomal binding site, and transcription
termination sequence; in eukaryotes, generally, such control sequences include promoters
and transcription termination sequence. The term "control sequences" is intended to
include, at a minimum, components whose presence can influence expression, and can also
include additional components whose presence is advantageous, for example, leader
sequences and fusion partner sequences.

"Isolated polynucleotide" refers to a polynucleotide of genomic, cDNA, or synthetic
origin or some combination thereof, which by virtue of its origin the "isolated
polynucleotide" (1) is not associated with the cell in which the "isolated polynucleotide" is
found in nature, or (2) is operably linked to a polynucleotide which it is not linked to in
nature.

"Polynucleotide" refers to a polymeric form of nucleotides of at least 10 bases in 
length, either ribonucleotides or deoxynucleotides or a modified form of either type of 
nucleotide. The term includes single and double stranded forms of DNA.

The term "probe" refers to a substance that specifically binds to another 
substance (a "target"). Probes include, for example, antibodies, nucleic acids, receptors and
their ligands.

"Modulation" refers to the capacity to either enhance or inhibit a functional property 
of a biological activity or process (e.g., enzyme activity or receptor binding); such
enhancement or inhibition may be contingent on the occurrence of a specific event, such as
activation of a signal transduction pathway, and/or may be manifest only in particular cell
types.

The term "modulator" refers to a chemical (naturally occurring or non-naturally
occurring), such as a synthetic molecule (e.g., nucleic acid, protein, non-peptide, or organic
molecule), or an extract made from biological materials such as bacteria, plants, fungi, or
animal (particularly mammalian) cells or tissues. Modulators can be evaluated for potential
activity as inhibitors or activators (directly or indirectly) of a biological process or processes
(e.g., agonist, partial antagonist, partial agonist, inverse agonist, antagonist, antineoplastic
agents, cytotoxic agents, inhibitors of neoplastic transformation or cell proliferation, cell
proliferation-promoting agents, and the like) by inclusion in screening assays described
herein. The activity of a modulator may be known, unknown or partially known.

The term "test chemical" refers to a chemical to be tested by one or more screening
method(s) of the invention as a putative modulator. A test chemical is usually not known to
bind to the target of interest. The term "control test chemical" refers to a chemical known
to bind to the target (e.g., a known agonist, antagonist, partial agonist or inverse agonist).
Usually, various predetermined concentrations of test chemicals are used for screening, such
as 0.01 µM, 0.1 µM, 1.0 µM, and 10.0 µM.

The term "target" refers to a biochemical entity involved in a biological process.
Targets are typically proteins that play a useful role in the physiology or biology of an
organism. A therapeutic chemical binds to a target to alter or modulate its function. As used
herein, targets can include cell surface receptors, G-proteins, kinases, ion channels,
phospholipases and other proteins mentioned herein.

The term "label" refers to a composition detectable by spectroscopic,
photochemical, biochemical, immunochemical, or chemical means. For example, useful
labels include ³²P, fluorescent dyes, fluorescent proteins, electron-dense reagents, enzymes
(e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins for which
antisera or monoclonal antibodies are available. For example, polypeptides of this invention
can be made as detectable labels, e.g., by incorporating them into a polypeptide, and
used to label antibodies specifically reactive with the polypeptide. A label often generates a
measurable signal, such as radioactivity, fluorescent light or enzyme activity, which can be
used to quantitate the amount of bound label.

The term "nucleic acid probe" refers to a nucleic acid molecule that binds to
a specific sequence or sub-sequence of another nucleic acid molecule. A probe is preferably
a nucleic acid molecule that binds through complementary base pairing to the full sequence
or to a sub-sequence of a target nucleic acid. It will be understood that probes may bind
target sequences lacking complete complementarity with the probe sequence depending
upon the stringency of the hybridization conditions. Probes are preferably directly labelled
as with isotopes, chromophores, lumiphores, chromogens, fluorescent proteins, or indirectly
labelled such as with biotin to which a streptavidin complex may later bind. By assaying
for the presence or absence of the probe, one can detect the presence or absence of the select
sequence or sub-sequence.

A "labeled nucleic acid probe" is a nucleic acid probe that is bound, either
covalently, through a linker, or through ionic, van der Waals or hydrogen bonds to a label
such that the presence of the probe may be detected by detecting the presence of the label
bound to the probe.

The terms "polypeptide" and "protein" refer to a polymer of amino acid
residues. The terms apply to amino acid polymers in which one or more amino acid residues
are artificial chemical analogues of a corresponding naturally occurring amino acid, as well
as to naturally occurring amino acid polymers. The term "recombinant protein" refers to a
protein that is produced by expression of a nucleotide sequence encoding the amino acid
sequence of the protein from a recombinant DNA molecule.

The term "recombinant host cell" refers to a cell that comprises a
recombinant nucleic acid molecule. Thus, for example, recombinant host cells can express
genes that are not found within the native (non-recombinant) form of the cell.

The terms "isolated," "purified" or "biologically pure" refer to material which
is substantially or essentially free from components which normally accompany it as found
in its native state. Purity and homogeneity are typically determined using analytical
chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid molecule which is the predominant protein or
nucleic acid species present in a preparation is substantially purified. Generally, an isolated
protein or nucleic acid molecule will comprise more than 80% of all macromolecular
species present in the preparation. Preferably, the protein is purified to represent greater
than 90% of all macromolecular species present. More preferably the protein is purified to
greater than 95%, and most preferably the protein is purified to essential homogeneity,
wherein other macromolecular species are not detected by conventional techniques.

The term "naturally-occurring" as applied to an object refers to the fact that 
an object can be found in nature. For example, a polypeptide or polynucleotide sequence 
that is present in an organism (including viruses) that can be isolated from a source in nature 
and which has not been intentionally modified by man in the laboratory is
naturally-occurring.

The term "antibody" refers to a polypeptide substantially encoded by an
immunoglobulin gene or immunoglobulin genes, or fragments thereof, which specifically
bind and recognize an analyte (antigen). The recognized immunoglobulin genes include the
kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as the
myriad immunoglobulin variable region genes. Antibodies exist, e.g., as intact
immunoglobulins or as a number of well characterized fragments produced by digestion
with various peptidases. This includes, e.g., Fab' and F(ab')2 fragments. The term
"antibody," as used herein, also includes antibody fragments either produced by the
modification of whole antibodies or those synthesized de novo using recombinant DNA
methodologies.

The term "immunoassay" refers to an assay that utilizes an antibody to 
specifically bind an analyte. The immunoassay is characterized by the use of specific 
binding properties of a particular antibody to isolate, target, and/or quantify the analyte. 

The term "identical" in the context of two nucleic acid or polypeptide
sequences refers to the residues in the two sequences which are the same when aligned for
maximum correspondence. When percentage of sequence identity is used in reference to
proteins or peptides it is recognized that residue positions which are not identical often
differ by conservative amino acid substitutions, where amino acid residues are substituted
for other amino acid residues with similar chemical properties (e.g., charge or
hydrophobicity) and therefore do not change the functional properties of the molecule.
Where sequences differ in conservative substitutions, the percent sequence identity may be
adjusted upwards to correct for the conservative nature of the substitution. Means for
making this adjustment are well known to those of skill in the art. Typically this involves
scoring a conservative substitution as a partial rather than a full mismatch, thereby
increasing the percentage sequence identity. Thus, for example, where an identical amino
acid is given a score of 1 and a non-conservative substitution is given a score of zero, a
conservative substitution is given a score between zero and 1. The scoring of conservative
substitutions is calculated, e.g., according to known algorithms. See, e.g., Meyers and
Miller, Computer Applic. Biol. Sci., 4: 11-17 (1988); Smith and Waterman (1981) Adv.
Appl. Math. 2: 482; Needleman and Wunsch (1970) J. Mol. Biol. 48: 443; Pearson and
Lipman (1988) Proc. Natl. Acad. Sci. USA 85: 2444; Higgins and Sharp (1988) Gene, 73:
237-244 and Higgins and Sharp (1989) CABIOS 5: 151-153; Corpet, et al. (1988) Nucleic
Acids Research 16, 10881-90; Huang, et al. (1992) Computer Applications in the
Biosciences 8, 155-65, and Pearson, et al. (1994) Methods in Molecular Biology 24, 307-31.
Alignment is also often performed by inspection and manual alignment.

"Conservatively modified variations" of a particular nucleic acid sequence
refers to those nucleic acids which encode identical or essentially identical amino acid
sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially
identical sequences. Because of the degeneracy of the genetic code, a large number of
functionally identical nucleic acids encode any given polypeptide. For instance, the codons
CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at
every position where an arginine is specified by a codon, the codon can be altered to any of
the corresponding codons described without altering the encoded polypeptide. Such nucleic
acid variations are "silent variations," which are one species of "conservatively modified
variations." Every nucleic acid sequence herein which encodes a polypeptide also describes
every possible silent variation. One of skill will recognize that each codon in a nucleic acid
(except AUG, which is ordinarily the only codon for methionine) can be modified to yield a
functionally identical molecule by standard techniques. Accordingly, each "silent variation"
of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
Furthermore, one of skill will recognize that individual substitutions, deletions or additions
which alter, add or delete a single amino acid or a small percentage of amino acids
(typically less than 5%, more typically less than 1%) in an encoded sequence are
"conservatively modified variations" where the alterations result in the substitution of an
amino acid with a chemically similar amino acid. Conservative amino acid substitutions
providing functionally similar amino acids are well known in the art. The following six
groups each contain amino acids that are conservative substitutions for one another:

1) Alanine (A), Serine (S), Threonine (T); 

2) Aspartic acid (D), Glutamic acid (E); 

3) Asparagine (N), Glutamine (Q); 

4) Arginine (R), Lysine (K); 

5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and 

6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). 
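By way of illustration only, the degeneracy of the genetic code described above can be checked against the standard codon table. The following Python sketch lists the codons that encode a given amino acid and counts the silent variations of a short coding RNA. The names CODON_TABLE, synonymous_codons and count_silent_variants are hypothetical and were chosen for this example; the sketch assumes the standard genetic code and an RNA whose length is a multiple of three.

```python
from itertools import product
from math import prod

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return, in sorted order, every codon that encodes the one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(rna):
    """Count the RNA sequences (including `rna` itself) that encode the same residues."""
    rna = rna.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    # Each codon can be swapped independently for any of its synonyms.
    return prod(len(synonymous_codons(CODON_TABLE[c])) for c in codons)
```

For example, synonymous_codons("R") returns the six arginine codons named in the text, while synonymous_codons("M") returns only AUG, so the two-codon sequence AUGCGU has six silent variations in total.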

The term "complementary" means that one nucleic acid molecule has the
sequence of the binding partner of another nucleic acid molecule. Thus, the sequence
5'-ATGC-3' is complementary to the sequence 5'-GCAT-3'.
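The complementarity relationship in the example above can be sketched in a few lines of Python. This is an illustration under the assumption that both sequences are DNA written 5' to 3' with unambiguous bases; the function names are hypothetical.

```python
# Watson-Crick pairing for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the binding partner of `seq`, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_complementary(seq_a, seq_b):
    """True if seq_b (5' to 3') is the binding partner of seq_a (5' to 3')."""
    return reverse_complement(seq_a) == seq_b.upper()
```

Here reverse_complement("ATGC") gives "GCAT", matching the pair of sequences given in the definition.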

An amino acid sequence or a nucleotide sequence is "substantially identical" 
or "substantially similar" to a reference sequence if the amino acid sequence or nucleotide 
sequence has at least 80% sequence identity with the reference sequence over a given 
comparison window. Thus, substantially similar sequences include those having, for 
example, at least 85% sequence identity, at least 90% sequence identity, at least 95% 
sequence identity or at least 99% sequence identity. Two sequences that are identical to 
each other are, of course, also substantially identical. 
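As a rough sketch of how the identity thresholds above might be evaluated, the following Python example scores two sequences that have already been aligned to equal length. It is an assumption-laden illustration, not the algorithm of any of the cited references: the partial score of 0.5 for a conservative substitution is an arbitrary value between zero and 1, gap positions (written "-") are scored as full mismatches, and the function names are hypothetical. The conservative groups are the six groups listed earlier in this section.

```python
# The six conservative-substitution groups listed in this section.
CONSERVATIVE_GROUPS = ("AST", "DE", "NQ", "RK", "ILMV", "FYW")


def is_conservative(a, b):
    """True if two different residues belong to the same conservative group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_identity(seq_a, seq_b, conservative_score=0.0):
    """Percent identity of two pre-aligned sequences of equal length.

    With conservative_score > 0, a conservative substitution counts as a
    partial match, which adjusts the percentage upwards.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    total = 0.0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if "-" in (a, b):
            continue  # assumption: a gap scores zero
        if a == b:
            total += 1.0
        elif is_conservative(a, b):
            total += conservative_score
    return 100.0 * total / len(seq_a)


def substantially_identical(seq_a, seq_b, threshold=80.0):
    """Apply the 80% identity criterion over the aligned comparison window."""
    return percent_identity(seq_a, seq_b) >= threshold
```

For the aligned pair "ASDF" and "ATDW", two of four positions are identical and two are conservative substitutions, so the unadjusted identity is 50% and the adjusted identity with a partial score of 0.5 is 75%.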

A subject nucleotide sequence is "substantially complementary" to a 
reference nucleotide sequence if the complement of the subject nucleotide sequence is 
substantially identical to the reference nucleotide sequence. 

The term "stringent conditions" refers to the temperature and ionic conditions
used in nucleic acid hybridization. Stringent conditions are sequence dependent and are
different under different environmental parameters. Generally, stringent conditions are
selected to be about 5 °C to 20 °C lower than the thermal melting point (Tm) for the specific
sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic
strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched
probe.

The term "allelic variants" refers to polymorphic forms of a gene at a 
particular genetic locus, as well as cDNAs derived from mRNA transcripts of the genes and 
the polypeptides encoded by them. 

The term "preferred mammalian codon" refers to the subset of codons from
among the set of codons encoding an amino acid that are most frequently used in proteins
expressed in mammalian cells as chosen from the following list:

Amino Acid    Preferred codons for high level mammalian expression

Gly           GGC, GGG
Glu           [illegible]
Asp           [illegible]
Val           [illegible]
Ala           GCC, GCU
Ser           AGC, UCC
Lys           AAG
Asn           AAC
Met           AUG
Ile           AUC
Thr           ACC
Trp           UGG
Cys           UGC
Tyr           UAU, UAC
Leu           CUG
Phe           UUC
Arg           CGC, AGG, AGA
Gln           CAG
His           CAC
Pro           CCC
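One way such a list might be used is to back-translate a protein sequence into RNA using a preferred codon at every position. The Python sketch below is illustrative only. It includes just those entries that can be read from the list above, so residues whose preferred codons are not legible there (Glu, Asp and Val) are rejected rather than guessed. Picking the first listed codon for each residue is an assumption made for simplicity, and the names PREFERRED_CODONS and back_translate are hypothetical.

```python
# Preferred mammalian codons, keyed by one-letter amino acid code.
# Only entries legible in the list above are included.
PREFERRED_CODONS = {
    "G": ("GGC", "GGG"), "A": ("GCC", "GCU"), "S": ("AGC", "UCC"),
    "K": ("AAG",), "N": ("AAC",), "M": ("AUG",), "I": ("AUC",),
    "T": ("ACC",), "W": ("UGG",), "C": ("UGC",), "Y": ("UAU", "UAC"),
    "L": ("CUG",), "F": ("UUC",), "R": ("CGC", "AGG", "AGA"),
    "Q": ("CAG",), "H": ("CAC",), "P": ("CCC",),
}


def back_translate(protein):
    """Return an RNA sequence encoding `protein` using the first preferred codon."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODONS:
            raise ValueError(f"no preferred codon listed for residue {residue!r}")
        codons.append(PREFERRED_CODONS[residue][0])
    return "".join(codons)
```

For example, back_translate("MSKG") yields AUGAGCAAGGGC.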



Fluorescent molecules are useful in fluorescence resonance energy transfer
("FRET"). FRET involves a donor molecule and an acceptor molecule. To optimize the
efficiency and detectability of FRET between a donor and acceptor molecule, several factors
need to be balanced. The emission spectrum of the donor should overlap as much as
possible with the excitation spectrum of the acceptor to maximize the overlap integral.
Also, the quantum yield of the donor moiety and the extinction coefficient of the acceptor
should likewise be as high as possible to maximize R0, the distance at which energy transfer
efficiency is 50%. However, the excitation spectra of the donor and acceptor should overlap
as little as possible so that a wavelength region can be found at which the donor can be
excited efficiently without directly exciting the acceptor. Fluorescence arising from direct
excitation of the acceptor is difficult to distinguish from fluorescence arising from FRET.
Similarly, the emission spectra of the donor and acceptor should overlap as little as possible
so that the two emissions can be clearly distinguished. High fluorescence quantum yield of
the acceptor moiety is desirable if the emission from the acceptor is to be measured either as
the sole readout or as part of an emission ratio. One factor to be considered in choosing the
donor and acceptor pair is the efficiency of fluorescence resonance energy transfer between
them. Preferably, the efficiency of FRET between the donor and acceptor is at least 10%,
more preferably at least 50% and even more preferably at least 80%.
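The dependence of transfer efficiency on distance can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6), which is consistent with the definition of R0 as the distance at which efficiency is 50%. The relation itself is not stated in the text above and is supplied here as an assumption; the function names are hypothetical, and r and R0 may be in any common length unit.

```python
def fret_efficiency(r, r0):
    """Forster transfer efficiency at donor-acceptor distance r, given R0."""
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


def max_distance_for_efficiency(efficiency, r0):
    """Largest distance at which the efficiency is still at least `efficiency`."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    # Invert E = 1 / (1 + (r/r0)**6) for r.
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
```

Under this relation, the preferred efficiencies of at least 10%, 50% and 80% correspond to donor-acceptor distances of at most about 1.44 R0, 1.00 R0 and 0.79 R0, respectively, which shows how steeply efficiency falls off with distance.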

The term "fluorescent property" refers to the molar extinction coefficient at
an appropriate excitation wavelength, the fluorescence quantum efficiency, the shape of the
excitation spectrum or emission spectrum, the excitation wavelength maximum and
emission wavelength maximum, the ratio of excitation amplitudes at two different
wavelengths, the ratio of emission amplitudes at two different wavelengths, the excited state
lifetime, or the fluorescence anisotropy. A measurable difference in any one of these
properties between wild-type Aequorea GFP and the mutant form is useful. A measurable
difference can be determined by measuring the amount of any quantitative fluorescent
property, e.g., the amount of fluorescence at a particular wavelength, or the integral of
fluorescence over the emission spectrum. Determining ratios of excitation amplitude or
emission amplitude at two different wavelengths ("excitation amplitude ratioing" and
"emission amplitude ratioing", respectively) is particularly advantageous because the
ratioing process provides an internal reference and cancels out variations in the absolute
brightness of the excitation source, the sensitivity of the detector, and light scattering or
quenching by the sample.
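The cancellation achieved by ratioing can be seen in a small numerical sketch. The Python example below assumes, as a simplification, that lamp brightness, detector sensitivity and sample losses combine into a single multiplicative factor that affects both wavelengths equally. The function names and the amplitude values are hypothetical.

```python
import math


def observed(true_amplitudes, instrument_factor):
    """Scale the true amplitudes at two wavelengths by a common instrument factor."""
    return tuple(instrument_factor * a for a in true_amplitudes)


def amplitude_ratio(signal_1, signal_2):
    """Ratio of the amplitudes measured at two wavelengths."""
    if signal_2 == 0:
        raise ValueError("second amplitude must be non-zero")
    return signal_1 / signal_2


# The same sample read on three differently calibrated instruments.
true_amplitudes = (120.0, 40.0)
ratios = [amplitude_ratio(*observed(true_amplitudes, k)) for k in (0.2, 1.0, 3.7)]
```

The absolute readings differ by more than an order of magnitude across the three instrument factors, but every entry in ratios is 3.0 to within floating-point error.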



II. LONG WAVELENGTH ENGINEERED FLUORESCENT PROTEINS

A. Fluorescent Proteins

As used herein, the term "fluorescent protein" refers to any protein capable of
fluorescence when excited with appropriate electromagnetic radiation. This includes 
fluorescent proteins whose amino acid sequences are either naturally occurring or 
engineered (i.e., analogs or mutants). Many cnidarians use green fluorescent proteins 
("GFPs") as energy-transfer acceptors in bioluminescence. A "green fluorescent protein," as 
used herein, is a protein that fluoresces green light. Similarly, "blue fluorescent proteins" 
fluoresce blue light and "red fluorescent proteins" fluoresce red light. GFPs have been 
isolated from the Pacific Northwest jellyfish, Aequorea victoria, the sea pansy, Renilla 
reniformis, and Phialidium gregarium. W.W. Ward et al., Photochem. Photobiol., 35:803-808
(1982); L.D. Levine et al., Comp. Biochem. Physiol., 72B:77-85 (1982).

A variety of Aequorea-related fluorescent proteins having useful excitation
and emission spectra have been engineered by modifying the amino acid sequence of a
naturally occurring GFP from Aequorea victoria. (D.C. Prasher et al., Gene, 111:229-233
(1992); R. Heim et al., Proc. Natl. Acad. Sci. USA, 91:12501-04 (1994); U.S. patent
application 08/337,915, filed November 10, 1994; International application
PCT/US95/14692, filed 11/10/95.)

As used herein, a fluorescent protein is an "Aequorea-related fluorescent
protein" if any contiguous sequence of 150 amino acids of the fluorescent protein has at
least 85% sequence identity with an amino acid sequence, either contiguous or
non-contiguous, from the 238 amino-acid wild-type Aequorea green fluorescent protein of Fig. 3
(SEQ ID NO:2). More preferably, a fluorescent protein is an Aequorea-related fluorescent
protein if any contiguous sequence of 200 amino acids of the fluorescent protein has at least
95% sequence identity with an amino acid sequence, either contiguous or non-contiguous,
from the wild type Aequorea green fluorescent protein of Fig. 3 (SEQ ID NO:2). Similarly,
the fluorescent protein may be related to Renilla or Phialidium wild-type fluorescent
proteins using the same standards.

Aequorea-related fluorescent proteins include, for example and without
limitation, wild-type (native) Aequorea victoria GFP (D.C. Prasher et al., "Primary structure
of the Aequorea victoria green fluorescent protein," Gene, (1992) 111:229-33), whose
nucleotide sequence (SEQ ID NO:1) and deduced amino acid sequence (SEQ ID NO:2) are
presented in Fig. 3; allelic variants of this sequence, e.g., Q80R, which has the glutamine
residue at position 80 substituted with arginine (M. Chalfie et al., Science, (1994)
263:802-805); those engineered Aequorea-related fluorescent proteins described herein, e.g.,
in Table A or Table F; variants that include one or more folding mutations; and fragments of
these proteins that are fluorescent, such as Aequorea green fluorescent protein from which the
two amino-terminal amino acids have been removed. Several of these contain different aromatic
amino acids within the central chromophore and fluoresce at a distinctly shorter wavelength
than wild type species. For example, engineered proteins P4 and P4-3 contain (in addition
to other mutations) the substitution Y66H, whereas W2 and W7 contain (in addition to other
mutations) Y66W. Other mutations both close to the chromophore region of the protein and
remote from it in primary sequence may affect the spectral properties of GFP and are listed
in the first part of the table below.

TABLE A

                                   Excitation   Emission     Extinct. Coeff.    Quantum
Clone       Mutation(s)            max (nm)     max (nm)     (M^-1 cm^-1)       yield

Wild type   None                   395 (475)    508          21,000 (7,150)     0.77
P4          Y66H                   383          447          13,500             0.21
P4-3        Y66H, Y145F            381          445          14,000             0.38
W7          Y66W, N146I, M153T,    433 (453)    475 (501)    18,000 (17,100)    0.67
            V163A, N212K
W2          Y66W, I123V, Y145H,    432 (453)    480          10,000 (9,600)     0.72
            H148R, M153T, V163A,
            N212K
S65T        S65T                   489          511          39,200             0.68
P4-1        S65T, M153A, K238E     504 (396)    514          14,500 (8,600)     0.53
S65A        S65A                   471          504
S65C        S65C                   479          507
S65L        S65L                   484          510
Y66F        Y66F                   360          442
Y66W        Y66W                   458          480



Additional mutations in Aequorea-related fluorescent proteins, referred to as
"folding mutations," improve the ability of fluorescent proteins to fold at higher
temperatures, and to be more fluorescent when expressed in mammalian cells, but have little
or no effect on the peak wavelengths of excitation and emission. It should be noted that
these may be combined with mutations that influence the spectral properties of GFP to
produce proteins with altered spectral and folding properties. Folding mutations include:
F64L, V68L, S72A, and also T44A, F99S, Y145F, N146I, M153T or A, V163A, I167T,
S175G, S205T and N212K.

As used herein, the term "loop domain" refers to an amino acid sequence of
an Aequorea-related fluorescent protein that connects the amino acids involved in the
secondary structure of the eleven strands of the β-barrel or the central α-helix (residues
56-72) (see Figs. 1A and 1B).

As used herein, the "fluorescent protein moiety" of a fluorescent protein is 
that portion of the amino acid sequence of a fluorescent protein which, when the amino acid 
sequence of the fluorescent protein substrate is optimally aligned with the amino acid 
sequence of a naturally occurring fluorescent protein, lies between the amino terminal and 
carboxy terminal amino acids, inclusive, of the amino acid sequence of the naturally 
occurring fluorescent protein. 

It has been found that fluorescent proteins can be genetically fused to other 
target proteins and used as markers to identify the location and amount of the target protein 
produced. Accordingly, this invention provides fusion proteins comprising a fluorescent 
protein moiety and additional amino acid sequences. Such sequences can be, for example, 
up to about 15, up to about 50, up to about 150 or up to about 1000 amino acids long. The
fusion proteins possess the ability to fluoresce when excited by electromagnetic radiation.
In one embodiment, the fusion protein comprises a polyhistidine tag to aid in purification of
the protein.



B. Use Of The Crystal Structure Of Green Fluorescent Protein To Design
Mutants Having Altered Fluorescent Characteristics

Using X-ray crystallography and computer processing, we have created a
model of the crystal structure of Aequorea green fluorescent protein showing the relative
location of the atoms in the molecule. This information is useful in identifying amino acids
whose substitution alters fluorescent properties of the protein.

Fluorescent characteristics of Aequorea-related fluorescent proteins depend,
in part, on the electronic environment of the chromophore. In general, amino acids that are
within about 0.5 nm of the chromophore influence the electronic environment of the
chromophore. Therefore, substitution of such amino acids can produce fluorescent proteins
with altered fluorescent characteristics. In the excited state, electron density tends to shift
from the phenolate towards the carbonyl end of the chromophore. Therefore, placement of
increasing positive charge near the carbonyl end of the chromophore tends to decrease the
energy of the excited state and cause a red-shift in the absorbance and emission wavelength
maximum of the protein. Decreasing positive charge near the carbonyl end of the
chromophore tends to have the opposite effect, causing a blue-shift in the protein's
wavelengths.

Amino acids with charged (ionized D, E, K, and R), dipolar (H, N, Q, S, T,
and uncharged D, E and K), and polarizable side groups (e.g., C, F, H, M, W and Y) are
useful for altering the electronic environment of the chromophore, especially when they
substitute for an amino acid with an uncharged, nonpolar or non-polarizable side chain. In
general, amino acids with polarizable side groups alter the electronic environment least and,
consequently, are expected to cause a comparatively smaller change in a fluorescent
property. Amino acids with charged side groups alter the environment most and,
consequently, are expected to cause a comparatively larger change in a fluorescent property.
However, amino acids with charged side groups are more likely to disrupt the structure of
the protein and to prevent proper folding if buried next to the chromophore without any
additional solvation or salt bridging. Therefore charged amino acids are most likely to be
tolerated and to give useful effects when they replace other charged or highly polar amino
acids that are already solvated or involved in salt bridges. In certain cases, where
substitution with a polarizable amino acid is chosen, the structure of the protein may make
selection of a larger amino acid, e.g., W, less appropriate. Alternatively, positions occupied
by amino acids with charged or polar side groups that are unfavorably oriented may be
substituted with amino acids that have less charged or polar side groups. In another
alternative, an amino acid whose side group has a dipole oriented in one direction in the
protein can be substituted with an amino acid having a dipole oriented in a different
direction.



More particularly, Table B lists several amino acids located within about 0.5
nm from the chromophore whose substitution can result in altered fluorescent
characteristics. The table indicates, underlined, preferred amino acid substitutions at the
indicated location to alter a fluorescent characteristic of the protein. In order to introduce
such substitutions, the table also provides codons for primers used in site-directed
mutagenesis involving amplification. These primers have been selected to encode
economically the preferred amino acids, but they encode other amino acids as well, as
indicated, or even a stop codon, denoted by Z. In introducing substitutions using such
degenerate primers the most efficient strategy is to screen the collection to identify mutants
with the desired properties and then sequence their DNA to find out which of the possible
substitutions is responsible. Codons are shown in double-stranded form with sense strand
above, antisense strand below. In nucleic acid sequences, R=(A or g); Y=(C or T); M=(A or
C); K=(g or T); S=(g or C); W=(A or T); H=(A, T, or C); B=(g, T, or C); V=(g, A, or C);
D=(g, A, or T); N=(A, C, g, or T).



TABLE B

Original position and presumed role                         Change to    Codon

L42   Aliphatic residue near C=N of chromophore             CFHLQRWYZ    5' YDS 3'
                                                                         3' RHS 5'
V61   Aliphatic residue near central -CH= of chromophore    FYHCLR       YDC
                                                                         RHg
T62   Almost directly above center of chromophore bridge    AVFS         KYC
                                                                         MRg
                                                            DEHKNQ       VAS
                                                                         BTS
                                                            FYHCLR       YDC
                                                                         RHg
V68   Aliphatic residue near carbonyl and G67               FYHL         YWC
                                                                         RWg
N121  Near C-N site of ring closure between T65 and G67     CFHLQRWYZ    YDS
                                                                         RHS
Y145  Packs near tyrosine ring of chromophore               WCFL         TKS
                                                                         AMS
                                                            DEHNKQ       VAS
                                                                         BTS
H148  H-bonds to phenolate oxygen                           FYNI         WWC
                                                                         WWg
                                                            KQR          MRg
                                                                         KYC
V150  Aliphatic residue near tyrosine ring of chromophore   FYHL         YWC
                                                                         RWg
F165  Packs near tyrosine ring                              CHQRWYZ      YRS
                                                                         RYS
I167  Aliphatic residue near phenolate; I167T has effects   FYHL         YWC
                                                                         RWg
T203  H-bonds to phenolic oxygen of chromophore             FHLQRWYZ     YDS
                                                                         RHS
E222  Protonation regulates ionization of chromophore       HKNQ         MAS
                                                                         KTS
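The relationship between a degenerate codon and the residues it can encode, as tabulated above, can be reproduced with a short Python sketch. It is offered as an illustration under the assumption of the standard genetic code, with Z denoting a stop codon as in the table and the lowercase g of the text treated as G. The names IUPAC, encoded_residues and antisense are hypothetical.

```python
from itertools import product

# Degenerate nucleotide codes as defined in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "GC", "W": "AT",
    "H": "ATC", "B": "GTC", "V": "GAC", "D": "GAT", "N": "ACGT",
}
# Complement of each code, e.g. Y (C or T) pairs with R (G or A).
IUPAC_COMPLEMENT = dict(zip("ACGTRYMKSWHBVDN", "TGCAYRKMSWDVBHN"))

BASES = "TCAG"
# Standard genetic code in TTT, TTC, TTA, TTG, TCT, ... order; Z marks a stop.
AMINO_ACIDS = "FFLLSSSSYYZZCCZWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def encoded_residues(degenerate_codon):
    """Return, sorted alphabetically, the residues a degenerate sense codon encodes."""
    choices = [IUPAC[base] for base in degenerate_codon.upper()]
    residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
    return "".join(sorted(residues))


def antisense(degenerate_codon):
    """Complement position by position, giving the lower strand as printed (3' to 5')."""
    return "".join(IUPAC_COMPLEMENT[base] for base in degenerate_codon.upper())
```

For instance, encoded_residues("YDS") returns "CFHLQRWYZ" and antisense("YDS") returns "RHS", in agreement with the first row of the table. A primer pool built on YDS therefore yields up to nine outcomes at that position, including a stop codon, which is why screening followed by sequencing is the suggested strategy.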

Examples of amino acids with polar side groups that can be substituted with
polarizable side groups include, for example, those in Table C.

TABLE C

Original position and presumed role                            Change to    Codon

Q69   Terminates chain of H-bonding waters                     KREG         RRg
                                                                            YYC
Q94   H-bonds to carbonyl terminus of chromophore              DEHKNQ       VAS
                                                                            BTS
Q183  Bridges Arg96 and center of chromophore bridge           HY           YAC
                                                                            RTg
                                                               EK           RAg
                                                                            YTC
N185  Part of H-bond network near carbonyl of chromophore      DEHNKQ       VAS
                                                                            BTS



In another embodiment, an amino acid that is close to a second amino acid
within about 0.5 nm of the chromophore can, upon substitution, alter the electronic
properties of the second amino acid, in turn altering the electronic environment of the
chromophore. Table D presents two such amino acids. The amino acids, L220 and V224,
are close to E222 and oriented in the same direction in the β-pleated sheet.



TABLE D

Original position and presumed role                       Change to    Codon

L220  Packs next to Glu222; to make GFP pH sensitive      HKNPQT       MMS
                                                                       KKS
V224  Packs next to Glu222; to make GFP pH sensitive      HKNPQT       MMS
                                                                       KKS
                                                          CFHLQRWYZ    YDS
                                                                       RHS

One embodiment of the invention includes a nucleic acid molecule comprising a
nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least a substitution at
Q69, wherein the functional engineered fluorescent protein has a different fluorescent
property than Aequorea green fluorescent protein. Preferably, the substitution at Q69 is
selected from the group of K, R, E and G. The Q69 substitution can be combined with other
mutations to improve the properties of the protein, such as a functional mutation at S65.

One embodiment of the invention includes a nucleic acid molecule comprising a
nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least a substitution at
E222, but not including E222G, wherein the functional engineered fluorescent protein has a
different fluorescent property than Aequorea green fluorescent protein. Preferably, the
substitution at E222 is selected from the group of N and Q. The E222 substitution can be
combined with other mutations to improve the properties of the protein, such as a functional
mutation at F64.

One embodiment of the invention includes a nucleic acid molecule comprising a
nucleotide sequence encoding a functional engineered fluorescent protein whose amino acid
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least a substitution at
Y145, wherein the functional engineered fluorescent protein has a different fluorescent
property than Aequorea green fluorescent protein. Preferably, the substitution at Y145 is
selected from the group of W, C, F, L, E, H, K and Q. The Y145 substitution can be combined
with other mutations to improve the properties of the protein, such as a mutation at Y66.

The invention also includes computer related embodiments, including computational
methods of using the crystal coordinates for designing new fluorescent protein mutations
and devices for storing the crystal data, including coordinates. For instance, the
invention includes a device comprising a storage device and, stored in the device, at least 10
atomic coordinates selected from the atomic coordinates listed in Figs. 5-1 to 5-28. More
coordinates can be stored depending on the complexity of the calculations or the objective
of using the coordinates (e.g., about 100, 1,000, or more coordinates). For example, larger
numbers of coordinates will be desirable for more detailed representations of fluorescent
protein structure. Typically, the storage device is a computer readable device that stores
code that receives as input the atomic coordinates, although other storage means
known in the art are contemplated. The computer readable device can be a floppy disk or a
hard drive.

C. Production Of Long Wavelength Engineered Fluorescent Proteins

Recombinant production of a fluorescent protein involves expressing a 
nucleic acid molecule having sequences that encode the protein. 

In one embodiment, the nucleic acid encodes a fusion protein in which a
single polypeptide includes the fluorescent protein moiety within a longer polypeptide. The
longer polypeptide can include a second functional protein, such as a FRET partner or a
protein having a second function (e.g., an enzyme, antibody or other binding protein).
Nucleic acids that encode fluorescent proteins are useful as starting materials.

The fluorescent proteins can be produced as fusion proteins by recombinant
DNA technology. Recombinant production of fluorescent proteins involves expressing
nucleic acids having sequences that encode the proteins. Nucleic acids encoding fluorescent
proteins can be obtained by methods known in the art. Fluorescent proteins can be made by
site-specific mutagenesis of other nucleic acids encoding fluorescent proteins, or by random
mutagenesis caused by increasing the error rate of PCR of the original polynucleotide with
0.1 mM MnCl2 and unbalanced nucleotide concentrations. See, e.g., U.S. patent application
08/337,915, filed November 10, 1994 or International application PCT/US95/14692, filed
11/10/95. The nucleic acid encoding a green fluorescent protein can be isolated by
polymerase chain reaction of cDNA from A. victoria using primers based on the DNA
sequence of A. victoria green fluorescent protein, as presented in Fig. 3. PCR methods are
described in, for example, U.S. Pat. No. 4,683,195; Mullis et al. (1987) Cold Spring Harbor
Symp. Quant. Biol. 51:263; and Erlich, ed., PCR Technology, (Stockton Press, NY, 1989).

The construction of expression vectors and the expression of genes in
transfected cells involves the use of molecular cloning techniques also well known in the
art. See Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY, (1989) and Current Protocols in Molecular Biology,
F.M. Ausubel et al., eds., (Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc.). The expression vector can be adapted for
function in prokaryotes or eukaryotes by inclusion of appropriate promoters, replication
sequences, markers, etc.

Nucleic acids used to transfect cells with sequences coding for expression of the
polypeptide of interest generally will be in the form of an expression vector including
expression control sequences operatively linked to a nucleotide sequence coding for
expression of the polypeptide. As used herein, the term "nucleotide sequence coding for
expression of" a polypeptide refers to a sequence that, upon transcription and translation of
mRNA, produces the polypeptide. This can include sequences containing, e.g., introns.
Expression control sequences are operatively linked to a nucleic acid sequence when the
expression control sequences control and regulate the transcription and, as appropriate,
translation of the nucleic acid sequence. Thus, expression control sequences can include
appropriate promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in
front of a protein-encoding gene, splicing signals for introns, maintenance of the correct
reading frame of that gene to permit proper translation of the mRNA, and stop codons.

Methods which are well known to those skilled in the art can be used to 
construct expression vectors containing the fluorescent protein coding sequence and 
appropriate transcriptional/translational control signals. These methods include in vitro 
recombinant DNA techniques, synthetic techniques and in vivo recombination/genetic 
recombination. (See, for example, the techniques described in Maniatis et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, N.Y., 1989).

Transformation of a host cell with recombinant DNA may be carried out by
conventional techniques as are well known to those skilled in the art. Where the host is
prokaryotic, such as E. coli, competent cells which are capable of DNA uptake can be
prepared from cells harvested after exponential growth phase and subsequently treated by
the CaCl2 method by procedures well known in the art. Alternatively, MgCl2 or RbCl can
be used. Transformation can also be performed after forming a protoplast of the host cell or
by electroporation.

When the host is a eukaryote, such methods of transfection of DNA as calcium 
phosphate co-precipitates, conventional mechanical procedures such as microinjection, 
electroporation, insertion of a plasmid encased in liposomes, or virus vectors may be used. 
Eukaryotic cells can also be cotransfected with DNA sequences encoding the fusion 
polypeptide of the invention, and a second foreign DNA molecule encoding a selectable 
phenotype, such as the herpes simplex thymidine kinase gene. Another method is to use a 
eukaryotic viral vector, such as simian virus 40 (SV40) or bovine papilloma virus, to 
transiently infect or transform eukaryotic cells and express the protein. (Eukaryotic Viral
Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982). Preferably, a eukaryotic host
is utilized as the host cell as described herein. 

Techniques for the isolation and purification of either microbially or eukaryotically 
expressed polypeptides of the invention may be by any conventional means such as, for 
example, preparative chromatographic separations and immunological separations such as
those involving the use of monoclonal or polyclonal antibodies or antigen. 

In one embodiment recombinant fluorescent proteins can be produced by expression 
of nucleic acid encoding the protein in E. coli. Aequorea-related fluorescent proteins are
best expressed by cells cultured between about 15°C and 30°C, but higher temperatures
(e.g., 37°C) are possible. After synthesis, these proteins are stable at higher temperatures
(e.g., 37°C) and can be used in assays at those temperatures.

A variety of host-expression vector systems may be utilized to express 
fluorescent protein coding sequence. These include but are not limited to microorganisms 
such as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or
cosmid DNA expression vectors containing a fluorescent protein coding sequence; yeast
transformed with recombinant yeast expression vectors containing the fluorescent protein
coding sequence; plant cell systems infected with recombinant virus expression vectors
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with
recombinant plasmid expression vectors (e.g., Ti plasmid) containing a fluorescent protein
coding sequence; insect cell systems infected with recombinant virus expression vectors
(e.g., baculovirus) containing a fluorescent protein coding sequence; or animal cell systems
infected with recombinant virus expression vectors (e.g., retroviruses, adenovirus, vaccinia 
virus) containing a fluorescent protein coding sequence, or transformed animal cell systems 
engineered for stable expression. 

Depending on the host/vector system utilized, any of a number of suitable
transcription and translation elements, including constitutive and inducible promoters,
transcription enhancer elements, transcription terminators, etc. may be used in the
expression vector (see, e.g., Bitter, et al., Methods in Enzymology 153:516-544, 1987). For
example, when cloning in bacterial systems, inducible promoters such as pL of
bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like may be used.

When cloning in mammalian cell systems, promoters derived from the genome of 
mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the
retrovirus long terminal repeat; the adenovirus late promoter; the vaccinia virus 7.5K
promoter) may be used. Promoters produced by recombinant DNA or synthetic techniques 
may also be used to provide for transcription of the inserted fluorescent protein coding 
sequence. 

In bacterial systems a number of expression vectors may be advantageously
selected depending upon the use intended for the fluorescent protein expressed. For
example, when large quantities of the fluorescent protein are to be produced, vectors which
direct the expression of high levels of fusion protein products that are readily purified may
be desirable. Those which are engineered to contain a cleavage site to aid in recovering
fluorescent protein are preferred.

In yeast, a number of vectors containing constitutive or inducible promoters may 
be used. For a review see Current Protocols in Molecular Biology, Vol. 2, Ed. Ausubel, et
al., Greene Publish. Assoc. & Wiley Interscience, Ch. 13, 1988; Grant, et al., Expression
and Secretion Vectors for Yeast, in Methods in Enzymology, Eds. Wu & Grossman,
Acad. Press, N.Y., Vol. 153, pp. 516-544, 1987; Glover, DNA Cloning, Vol. II, IRL Press,
Wash., D.C., Ch. 3, 1986; Bitter, Heterologous Gene Expression in Yeast, Methods in
Enzymology, Eds. Berger & Kimmel, Acad. Press, N.Y., Vol. 152, pp. 673-684, 1987; and
The Molecular Biology of the Yeast Saccharomyces, Eds. Strathern et al., Cold Spring
Harbor Press, Vols. I and II, 1982. A constitutive yeast promoter such as ADH or LEU2 or
an inducible promoter such as GAL may be used (Cloning in Yeast, Ch. 3, R. Rothstein In:
DNA Cloning Vol. II, A Practical Approach, Ed. DM Glover, IRL Press, Wash., D.C.,
1986). Alternatively, vectors may be used which promote integration of foreign DNA
sequences into the yeast chromosome. 

In cases where plant expression vectors are used, the expression of a fluorescent
protein coding sequence may be driven by any of a number of promoters. For example,
viral promoters such as the 35S RNA and 19S RNA promoters of CaMV (Brisson, et al.,
Nature 310:511-514, 1984), or the coat protein promoter of TMV (Takamatsu, et al., EMBO
J. 6:307-311, 1987) may be used; alternatively, plant promoters such as the small subunit of
RUBISCO (Coruzzi, et al., 1984, EMBO J. 3:1671-1680; Broglie, et al., Science 224:838-843,
1984); or heat shock promoters, e.g., soybean hsp17.5-E or hsp17.3-B (Gurley, et al.,
Mol. Cell. Biol. 6:559-565, 1986) may be used. These constructs can be introduced into
plant cells using Ti plasmids, Ri plasmids, plant virus vectors, direct DNA transformation,
microinjection, electroporation, etc. For reviews of such techniques see, for example,
Weissbach & Weissbach, Methods for Plant Molecular Biology, Academic Press, NY, 
Section VIII, pp. 421-463, 1988; and Grierson & Corey, Plant Molecular Biology, 2d Ed., 
Blackie, London, Ch. 7-9, 1988.

An alternative expression system which could be used to express fluorescent
protein is an insect system. In one such system, Autographa californica nuclear
polyhedrosis virus (AcNPV) is used as a vector to express foreign genes. The virus grows in
Spodoptera frugiperda cells. The fluorescent protein coding sequence may be cloned into
non-essential regions (for example, the polyhedrin gene) of the virus and placed under
control of an AcNPV promoter (for example, the polyhedrin promoter). Successful insertion
of the fluorescent protein coding sequence will result in inactivation of the polyhedrin gene
and production of non-occluded recombinant virus (i.e., virus lacking the proteinaceous coat
coded for by the polyhedrin gene). These recombinant viruses are then used to infect
Spodoptera frugiperda cells in which the inserted gene is expressed; see Smith, et al., J.
Virol. 46:584, 1983; Smith, U.S. Patent No. 4,215,051.

Eukaryotic systems, and preferably mammalian expression systems, allow for 
proper post-translational modifications of expressed mammalian proteins to occur.
Eukaryotic cells which possess the cellular machinery for proper processing of the primary
transcript, glycosylation, phosphorylation, and, advantageously, secretion of the gene
product should be used as host cells for the expression of fluorescent protein. Such host
cell lines may include but are not limited to CHO, VERO, BHK, HeLa, COS, MDCK,
Jurkat, HEK-293, and WI38. 

Mammalian cell systems which utilize recombinant viruses or viral elements to 
direct expression may be engineered. For example, when using adenovirus expression
vectors, the fluorescent protein coding sequence may be ligated to an adenovirus
transcription/translation control complex, e.g., the late promoter and tripartite leader
sequence. This chimeric gene may then be inserted in the adenovirus genome by in vitro or
in vivo recombination. Insertion in a non-essential region of the viral genome (e.g., region
E1 or E3) will result in a recombinant virus that is viable and capable of expressing the
fluorescent protein in infected hosts (e.g., see Logan & Shenk, Proc. Natl. Acad. Sci. USA,
81:3655-3659, 1984). Alternatively, the vaccinia virus 7.5K promoter may be used (e.g.,
see Mackett, et al., Proc. Natl. Acad. Sci. USA, 79:7415-7419, 1982; Mackett, et al., J.
Virol. 49:857-864, 1984; Panicali, et al., Proc. Natl. Acad. Sci. USA 79:4927-4931, 1982).
Of particular interest are vectors based on bovine papilloma virus which have the ability to
replicate as extrachromosomal elements (Sarver, et al., Mol. Cell. Biol. 1:486, 1981).
Shortly after entry of this DNA into mouse cells, the plasmid replicates to about 100 to 200
copies per cell. Transcription of the inserted cDNA does not require integration of the
plasmid into the host's chromosome, thereby yielding a high level of expression. These
vectors can be used for stable expression by including a selectable marker in the plasmid,
such as the neo gene. Alternatively, the retroviral genome can be modified for use as a
vector capable of introducing and directing the expression of the fluorescent protein gene in
host cells (Cone & Mulligan, Proc. Natl. Acad. Sci. USA, 81:6349-6353, 1984). High level
expression may also be achieved using inducible promoters, including, but not limited to,
the metallothionein IIA promoter and heat shock promoters.

The invention can also include a localization sequence, such as a nuclear 
localization sequence, an endoplasmic reticulum localization sequence, a peroxisome
localization sequence, a mitochondrial localization sequence, or a localized protein.
Localization sequences can be targeting sequences which are described, for example, in
"Protein Targeting", chapter 35 of Stryer, L., Biochemistry (4th ed.). W.H. Freeman, 1995.
The localization sequence can also be a localized protein. Some important localization
sequences include those targeting the nucleus (KKKRK), mitochondrion (amino terminal
MLRTSSLFTRRVQPSLFRNILRLQST-), endoplasmic reticulum (KDEL at C-terminus,
assuming a signal sequence present at N-terminus), peroxisome (SKF at C-terminus),
prenylation or insertion into plasma membrane (CaaX, CC, CXC, or CCXX at C-terminus),
cytoplasmic side of plasma membrane (fusion to SNAP-25), or the Golgi apparatus (fusion
to furin).
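
By way of illustration only, the short peptide targeting sequences listed above can be represented as a lookup table and fused in silico to a protein sequence. The following Python sketch is hypothetical: the table layout, the function name and the placeholder sequence are assumptions made for the example, and the choice of the C-terminus for the nuclear sequence is likewise an assumption, since the text does not state where that sequence is placed.

```python
# Illustrative sketch only: the dictionary layout and function name are
# hypothetical and do not come from the specification.
# Tag sequences (one-letter amino acid code) are the examples listed above.
LOCALIZATION_TAGS = {
    # compartment: (tag sequence, terminus where the tag is attached)
    "nucleus": ("KKKRK", "C"),  # terminus not stated in the text; "C" is an assumption
    "mitochondrion": ("MLRTSSLFTRRVQPSLFRNILRLQST", "N"),
    "endoplasmic_reticulum": ("KDEL", "C"),  # also requires an N-terminal signal sequence
    "peroxisome": ("SKF", "C"),
}


def add_localization_tag(protein_seq: str, compartment: str) -> str:
    """Return protein_seq with the tag for `compartment` fused at the listed terminus."""
    try:
        tag, terminus = LOCALIZATION_TAGS[compartment]
    except KeyError:
        raise ValueError(f"no tag listed for compartment {compartment!r}") from None
    return tag + protein_seq if terminus == "N" else protein_seq + tag


if __name__ == "__main__":
    placeholder = "MSKGEELFTG"  # placeholder sequence for illustration, not a full protein
    print(add_localization_tag(placeholder, "endoplasmic_reticulum"))
```

Targeting by fusion to a localized protein (e.g., SNAP-25 or furin) or by the prenylation motifs is not modeled in this sketch.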

For long-term, high-yield production of recombinant proteins, stable expression
is preferred. Rather than using expression vectors which contain viral origins of replication,
host cells can be transformed with the fluorescent protein cDNA controlled by appropriate
expression control elements (e.g., promoter, enhancer sequences, transcription terminators,
polyadenylation sites, etc.), and a selectable marker. The selectable marker in the
recombinant plasmid confers resistance to the selection and allows cells to stably integrate
the plasmid into their chromosomes and grow to form foci which in turn can be cloned and
expanded into cell lines. For example, following the introduction of foreign DNA,
engineered cells may be allowed to grow for 1-2 days in an enriched medium, and then are
switched to a selective medium. A number of selection systems may be used, including but
not limited to the herpes simplex virus thymidine kinase (Wigler, et al., Cell, 11:223,
1977), hypoxanthine-guanine phosphoribosyltransferase (Szybalska & Szybalski, Proc.
Natl. Acad. Sci. USA, 48:2026, 1962), and adenine phosphoribosyltransferase (Lowy, et al.,
Cell, 22:817, 1980) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively. Also,
antimetabolite resistance can be used as the basis of selection for dhfr, which confers
resistance to methotrexate (Wigler, et al., Proc. Natl. Acad. Sci. USA, 77:3567, 1980;
O'Hare, et al., Proc. Natl. Acad. Sci. USA, 78:1527, 1981); gpt, which confers resistance to
mycophenolic acid (Mulligan & Berg, Proc. Natl. Acad. Sci. USA, 78:2072, 1981); neo,
which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., J. Mol.
Biol., 150:1, 1981); and hygro, which confers resistance to hygromycin (Santerre, et al.,
Gene, 30:147, 1984) genes. Recently, additional selectable genes have been described,
namely trpB, which allows cells to utilize indole in place of tryptophan; hisD, which allows
cells to utilize histidinol in place of histidine (Hartman & Mulligan, Proc. Natl. Acad. Sci.
USA, 85:8047, 1988); and ODC (ornithine decarboxylase) which confers resistance to the
ornithine decarboxylase inhibitor, 2-(difluoromethyl)-DL-ornithine, DFMO (McConlogue 
L., In: Current Communications in Molecular Biology, Cold Spring Harbor Laboratory, ed., 
1987). 

DNA sequences encoding the fluorescent protein polypeptide of the invention
can be expressed in vitro by DNA transfer into a suitable host cell. "Host cells" are cells in
which a vector can be propagated and its DNA expressed. The term also includes any
progeny of the subject host cell. It is understood that all progeny may not be identical to the
parental cell since there may be mutations that occur during replication. However, such
progeny are included when the term "host cell" is used. Methods of stable transfer, in other
words when the foreign DNA is continuously maintained in the host, are known in the art.

The expression vector can be transfected into a host cell for expression of the 
recombinant nucleic acid. Host cells can be selected for high levels of expression in order 
to purify the fluorescent protein fusion protein. E. coli is useful for this purpose.
Alternatively, the host cell can be a prokaryotic or eukaryotic cell selected to study the
activity of an enzyme produced by the cell. In this case, the linker peptide is selected to
include an amino acid sequence recognized by the protease. The cell can be, e.g., a cultured
cell or a cell in vivo. 

A primary advantage of fluorescent protein fusion proteins is that they are 
prepared by normal protein biosynthesis, thus completely avoiding organic synthesis and 
the requirement for customized unnatural amino acid analogs. The constructs can be
expressed in E. coli in large scale for in vitro assays. Purification from bacteria is simplified
when the sequences include polyhistidine tags for one-step purification by nickel-chelate 
chromatography. Alternatively, the substrates can be expressed directly in a desired host 
cell for assays in situ. 

In another embodiment, the invention provides a transgenic non-human animal
that expresses a nucleic acid sequence which encodes the fluorescent protein.

The "non-human animals" of the invention comprise any non-human animal 
having a nucleic acid sequence which encodes a fluorescent protein. Such non-human animals
include vertebrates such as rodents, non-human primates, sheep, dog, cow, pig, amphibians,
and reptiles. Preferred non-human animals are selected from the rodent family including rat
and mouse, most preferably mouse. The "transgenic non-human animals" of the invention
are produced by introducing "transgenes" into the germline of the non-human animal.
Embryonal target cells at various developmental stages can be used to introduce transgenes.
Different methods are used depending on the stage of development of the embryonal target
cell. The zygote is the best target for micro-injection. In the mouse, the male pronucleus
reaches the size of approximately 20 micrometers in diameter which allows reproducible
injection of 1-2 pl of DNA solution. The use of zygotes as a target for gene transfer has a
major advantage in that in most cases the injected DNA will be incorporated into the host
gene before the first cleavage (Brinster et al., Proc. Natl. Acad. Sci. USA 82:4438-4442,
1985). As a consequence, all cells of the transgenic non-human animal will carry the
incorporated transgene. This will in general also be reflected in the efficient transmission of
the transgene to offspring of the founder since 50% of the germ cells will harbor the 
transgene. Microinjection of zygotes is the preferred method for incorporating transgenes in 
practicing the invention. 

The term "transgenic" is used to describe an animal which includes exogenous
genetic material within all of its cells. A "transgenic" animal can be produced by cross-breeding
two chimeric animals which include exogenous genetic material within cells used
in reproduction. Twenty-five percent of the resulting offspring will be transgenic, i.e.,
animals which include the exogenous genetic material within all of their cells in both
alleles. Fifty percent of the resulting animals will include the exogenous genetic material within one
allele and 25% will include no exogenous genetic material.
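
The 25%/50%/25% distribution stated above follows from simple Mendelian segregation when each parent transmits the exogenous material in half of its germ cells. A minimal Python sketch of that arithmetic is given below for illustration; the function name and the allele labels ("T" for an allele carrying the transgene, "+" for an allele without it) are hypothetical, and the sketch assumes one transgene-bearing allele per parent and equal transmission of both alleles.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def transgene_copy_distribution(parent_a=("T", "+"), parent_b=("T", "+")):
    """Map number of transgene-bearing alleles in offspring -> expected fraction.

    Each parent is given as a pair of alleles; every pairing of one allele
    from each parent is treated as equally likely (a Punnett square).
    """
    combos = list(product(parent_a, parent_b))
    counts = Counter(sum(allele == "T" for allele in combo) for combo in combos)
    return {copies: Fraction(n, len(combos)) for copies, n in counts.items()}


if __name__ == "__main__":
    for copies, fraction in sorted(transgene_copy_distribution().items(), reverse=True):
        print(f"{copies} transgene-bearing allele(s): {fraction}")
```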

Retroviral infection can also be used to introduce a transgene into a non-human
animal. The developing non-human embryo can be cultured in vitro to the blastocyst stage.
During this time, the blastomeres can be targets for retroviral infection (Jaenisch, R., Proc.
Natl. Acad. Sci. USA 73:1260-1264, 1976). Efficient infection of the blastomeres is obtained
by enzymatic treatment to remove the zona pellucida (Hogan, et al. (1986) in Manipulating
the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). The
viral vector system used to introduce the transgene is typically a replication-defective
retrovirus carrying the transgene (Jahner, et al., Proc. Natl. Acad. Sci. USA 82:6927-6931, 1985;
Van der Putten, et al., Proc. Natl. Acad. Sci. USA 82:6148-6152, 1985). Transfection is
easily and efficiently obtained by culturing the blastomeres on a monolayer of
virus-producing cells (Van der Putten, supra; Stewart, et al., EMBO J. 6:383-388, 1987).
Alternatively, infection can be performed at a later stage. Virus or virus-producing cells can
be injected into the blastocoele (D. Jahner et al., Nature 298:623-628, 1982). Most of the
founders will be mosaic for the transgene since incorporation occurs only in a subset of the
cells which formed the transgenic nonhuman animal. Further, the founder may contain
various retroviral insertions of the transgene at different positions in the genome which
generally will segregate in the offspring. In addition, it is also possible to introduce
transgenes into the germ line, albeit with low efficiency, by intrauterine retroviral infection
of the midgestation embryo (D. Jahner et al., supra).

A third type of target cell for transgene introduction is the embryonal stem cell
(ES). ES cells are obtained from pre-implantation embryos cultured in vitro and fused with
embryos (M. J. Evans et al., Nature 292:154-156, 1981; M. O. Bradley et al., Nature 309:255-258,
1984; Gossler, et al., Proc. Natl. Acad. Sci. USA 83:9065-9069, 1986; and
Robertson et al., Nature 322:445-448, 1986). Transgenes can be efficiently introduced into
the ES cells by DNA transfection or by retrovirus-mediated transduction. Such transformed
ES cells can thereafter be combined with blastocysts from a nonhuman animal. The ES cells
thereafter colonize the embryo and contribute to the germ line of the resulting chimeric 
animal. (For review see Jaenisch, R., Science 240:1468-1474, 1988).

"Transformed" means a cell into which (or into an ancestor of which) has been
introduced, by means of recombinant nucleic acid techniques, a heterologous nucleic acid
molecule. "Heterologous" refers to a nucleic acid sequence that either originates from
another species or is modified from either its original form or the form primarily expressed
in the cell.

"Transgene" means any piece of DNA which is inserted by artifice into a cell, 
and becomes part of the genome of the organism (i.e., either stably integrated or as a stable 
extrachromosomal element) which develops from that cell. Such a transgene may include a 
gene which is partly or entirely heterologous (i.e., foreign) to the transgenic organism, or
may represent a gene homologous to an endogenous gene of the organism. Included within
this definition is a transgene created by the providing of an RNA sequence which is
transcribed into DNA and then incorporated into the genome. The transgenes of the
invention include DNA sequences which encode the fluorescent protein
which may be expressed in a transgenic non-human animal. The term "transgenic" as used
herein additionally includes any organism whose genome has been altered by in vitro
manipulation of the early embryo or fertilized egg or by any transgenic technology to induce
a specific gene knockout. The term "gene knockout" as used herein, refers to the targeted
disruption of a gene in vivo with complete loss of function that has been achieved by any
transgenic technology familiar to those in the art. In one embodiment, transgenic animals
having gene knockouts are those in which the target gene has been rendered nonfunctional
by an insertion targeted to the gene to be rendered non-functional by homologous
recombination. As used herein, the term "transgenic" includes any transgenic technology 
familiar to those in the art which can produce an organism carrying an introduced transgene 
or one in which an endogenous gene has been rendered non-functional or "knocked out." 

III. USES OF ENGINEERED FLUORESCENT PROTEINS

The proteins of this invention are useful in any methods that employ
fluorescent proteins.

The engineered fluorescent proteins of this invention are useful as
fluorescent markers in the many ways fluorescent markers already are used. This includes,
for example, coupling engineered fluorescent proteins to antibodies, nucleic acids or other
receptors for use in detection assays, such as immunoassays or hybridization assays.

The engineered fluorescent proteins of this invention are useful to track the
movement of proteins in cells. In this embodiment, a nucleic acid molecule encoding the
fluorescent protein is fused to a nucleic acid molecule encoding the protein of interest in an
expression vector. Upon expression inside the cell, the protein of interest can be localized
based on fluorescence. In another version, two proteins of interest are fused with two
engineered fluorescent proteins having different fluorescent characteristics.

The engineered fluorescent proteins of this invention are useful in systems to 
detect induction of transcription. In certain embodiments, a nucleotide sequence encoding 
the engineered fluorescent protein is fused to expression control sequences of interest and
the expression vector is transfected into a cell. Induction of the promoter can be measured
by detecting the expression and/or quantity of fluorescence. Such constructs can be
used to follow signaling pathways from receptor to promoter.

The engineered fluorescent proteins of this invention are useful in 
applications involving FRET. Such applications can detect events as a function of the
movement of fluorescent donor and acceptor towards or away from each other. One or
both of the donor/acceptor pair can be a fluorescent protein. A preferred donor and acceptor
pair for FRET based assays is a donor with a T203I mutation and an acceptor with the
mutation T203X, wherein X is an aromatic amino acid, especially T203Y, T203W, or
T203H. In a particularly useful pair the donor contains the following mutations: S72A,
K79R, Y145F, M153A and T203I (with an excitation peak of 395 nm and an emission peak
of 511 nm) and the acceptor contains the following mutations: S65G, S72A, K79R, and
T203Y. This particular pair provides a wide separation between the excitation and emission
peaks of the donor and provides good overlap between the donor emission spectrum and the
acceptor excitation spectrum. Other red-shifted mutants, such as those described herein,
can also be used as the acceptor in such a pair.

In one aspect, FRET is used to detect the cleavage of a substrate having the 
donor and acceptor coupled to the substrate on opposite sides of the cleavage site. Upon 
cleavage of the substrate, the donor/acceptor pair physically separate, eliminating FRET. 
Assays involve contacting the substrate with a sample, and determining a qualitative or
quantitative change in FRET. In one embodiment, the engineered fluorescent protein is
used in a substrate for β-lactamase. Examples of such substrates are described in United
States patent application 08/407,544, filed March 20, 1995 and International Application
PCT/US96/04059, filed March 20, 1996. In another embodiment, an engineered fluorescent
protein donor/acceptor pair are part of a fusion protein coupled by a peptide having a 
proteolytic cleavage site. Such tandem fluorescent proteins are described in United States 
patent application 08/594,575, filed January 31, 1996. 
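
The loss of FRET upon cleavage can be appreciated from the standard Förster relationship, in which transfer efficiency E depends on the donor-acceptor distance r as E = 1 / (1 + (r/R0)^6), R0 being the distance at which transfer is 50% efficient. The following Python sketch is illustrative only; the function name is hypothetical, and the R0 value of 5 nm and the example distances are assumed for demonstration and are not measured values for any pair described herein.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Forster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Forster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # assumed Forster radius in nm, for illustration only
    # Short distances stand in for an intact substrate, long ones for separated fragments.
    for r in (3.0, 5.0, 10.0, 50.0):
        print(f"r = {r:5.1f} nm -> E = {fret_efficiency(r, r0):.4f}")
```

Because of the sixth-power dependence, efficiency falls steeply once the separation exceeds R0, which is why physical separation of the pair after cleavage effectively eliminates FRET.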

In another aspect, FRET is used to detect changes in potential across a 
membrane. A donor and acceptor are placed on opposite sides of a membrane such that one 
translates across the membrane in response to a voltage change. This creates a measurable 
FRET. Such a method is described in United States patent application 08/481,977, filed 
June 7, 1995 and International Application PCT/US96/09652, filed June 6, 1996. 

The engineered proteins of this invention are useful in the creation of
fluorescent substrates for protein kinases. Such substrates incorporate an amino acid 
sequence recognizable by protein kinases. Upon phosphorylation, the engineered 
fluorescent protein undergoes a change in a fluorescent property. Such substrates are useful 
in detecting and measuring protein kinase activity in a sample of a cell, upon transfection 
and expression of the substrate. Preferably, the kinase recognition site is placed within 
about 20 amino acids of a terminus of the engineered fluorescent protein. The kinase 
recognition site also can be placed in a loop domain of the protein. (See, e.g., Figure 1B.)
Methods for making fluorescent substrates for protein kinases are described in United States 
patent application 08/680,877, filed July 16, 1996. 

A protease recognition site also can be introduced into a loop domain. Upon
cleavage, a fluorescent property changes in a measurable fashion.

The invention also includes a method of identifying a test chemical. Typically, the
method includes contacting a test chemical with a sample containing a biological entity labeled
with a functional, engineered fluorescent protein or a polynucleotide encoding said
functional, engineered fluorescent protein. By monitoring fluorescence (i.e., a fluorescent
property) from the sample containing the functional engineered fluorescent protein it can be
determined whether a test chemical is active. Controls can be included to ensure the
specificity of the signal. Such controls include measurements of a fluorescent property in
the absence of the test chemical, in the presence of a chemical with an expected activity
(e.g., a known modulator) or engineered controls (e.g., absence of engineered fluorescent
protein, absence of engineered fluorescent protein polynucleotide or the absence of operable
linkage of the engineered fluorescent protein).

The fluorescence in the presence of a test chemical can be greater or less than in the 
absence of said test chemical. For instance, if the engineered fluorescent protein is used as a
reporter of gene expression, the test chemical may up or down regulate gene expression.
For such types of screening, the polynucleotide encoding the functional, engineered
fluorescent protein is operatively linked to a genomic polynucleotide or a re. Alternatively,
the functional, engineered fluorescent protein is fused to a second functional protein. This
embodiment can be used to track localization of the second protein or to track protein- 
protein interactions using energy transfer. 
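
As a hedged illustration of the comparison described above, fluorescence measured in the presence of a test chemical can be expressed as a ratio to fluorescence measured in its absence, after subtracting the signal of a control lacking the engineered fluorescent protein. In the Python sketch below, the function name, the two-fold threshold and the example readings are all assumptions chosen for the example; the specification does not prescribe any particular threshold or scoring rule.

```python
def classify_test_chemical(signal_with, signal_without, background=0.0, threshold=2.0):
    """Classify a test chemical as 'up', 'down' or 'no change'.

    signal_with / signal_without: fluorescence with and without the test chemical.
    background: fluorescence of a control lacking the fluorescent protein.
    threshold: fold change treated as significant (an assumed cutoff).
    """
    if threshold <= 1.0:
        raise ValueError("threshold must be greater than 1")
    treated = signal_with - background
    control = signal_without - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    ratio = treated / control
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "no change"


if __name__ == "__main__":
    # Arbitrary fluorescence units, invented for illustration.
    print(classify_test_chemical(900.0, 300.0, background=100.0))
    print(classify_test_chemical(150.0, 300.0, background=100.0))
    print(classify_test_chemical(320.0, 300.0, background=100.0))
```

In practice replicate wells and a known modulator would also be measured, as noted in the discussion of controls; those are omitted here for brevity.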

IV. PROCEDURES 

Fluorescence in a sample is measured using a fluorimeter. In general, 
excitation radiation from an excitation source having a first wavelength, passes through 
excitation optics. The excitation optics cause the excitation radiation to excite the sample.
In response, fluorescent proteins in the sample emit radiation which has a wavelength that is
different from the excitation wavelength. Collection optics then collect the emission from
the sample. The device can include a temperature controller to maintain the sample at a
specific temperature while it is being scanned. According to one embodiment, a multi-axis
translation stage moves a microtiter plate holding a plurality of samples in order to position
different wells to be exposed. The multi-axis translation stage, temperature controller,
auto-focusing feature, and electronics associated with imaging and data collection can be
managed by an appropriately programmed digital computer. The computer also can
transform the data collected during the assay into another format for presentation. This
process can be miniaturized and automated to enable screening many thousands of 
compounds. 

Methods of performing assays on fluorescent materials are well known in the 
art and are described in, e.g., Lakowicz, J.R., Principles of Fluorescence Spectroscopy, New
York: Plenum Press (1983); Herman, B., Resonance energy transfer microscopy, in:
Fluorescence Microscopy of Living Cells in Culture, Part B, Methods in Cell Biology, vol.
30, ed. Taylor, D.L. & Wang, Y.-L., San Diego: Academic Press (1989), pp. 219-243;
Turro, N.J., Modern Molecular Photochemistry, Menlo Park: Benjamin/Cummings
Publishing Co., Inc. (1978), pp. 296-361.

The following examples are provided by way of illustration, not by way of
limitation.

EXAMPLES

As a step in understanding the properties of GFP, and to aid in the tailoring 
of GFPs with altered characteristics, we have determined the three dimensional structure at 
1.9 Å resolution of the S65T mutant (R. Heim et al. Nature 373:664-665 (1995)) of A.
victoria GFP. This mutant also contains the ubiquitous Q80R substitution, which
accidentally occurred in the early distribution of the GFP cDNA and is not known to have
any effect on the protein properties (M. Chalfie et al. Science 263:802-805 (1994)).

Histidine-tagged S65T GFP (R. Heim et al. Nature 373:664-665 (1995)) was 
overexpressed in JM109/pRSET B in 4 l YT broth plus ampicillin at 37°, 450 rpm and 5
l/min air flow. The temperature was reduced to 25° at A595 = 0.3, followed by induction
with 1 mM isopropylthiogalactoside for 5 h. Cell paste was stored at -80° overnight, then
was resuspended in 50 mM HEPES pH 7.9, 0.3 M NaCl, 5 mM 2-mercaptoethanol, 0.1 mM
phenylmethyl-sulfonylfluoride (PMSF), passed once through a French press at 10,000 psi,
then centrifuged at 20 K rpm for 45 min. The supernatant was applied to a Ni-NTA-agarose
column (Qiagen), followed by a wash with 20 mM imidazole, then eluted with 100 mM
imidazole. Green fractions were pooled and subjected to chymotryptic (Sigma) proteolysis
(1:50 w/w) for 22 h at RT. After addition of 0.5 mM PMSF, the digest was reapplied to the
Ni column. N-terminal sequencing verified the presence of the correct N-terminal
methionine. After dialysis against 20 mM HEPES, pH 7.5 and concentration to A490 = 20,
rod-shaped crystals were obtained at RT in hanging drops containing 5 µl protein and 5 µl
well solution, 22-26% PEG 4000 (Serva), 50 mM HEPES pH 8.0-8.5, 50 mM MgCl2 and 10
mM 2-mercaptoethanol within 5 days. Crystals were 0.05 mm across and up to 1.0 mm
long. The space group is P2₁2₁2₁ with a = 51.8, b = 62.8, c = 70.7 Å, Z = 4. Two crystal
forms of wild-type GFP, unrelated to the present form, have been described by M. A.
Perozzo, K. B. Ward, R. B. Thompson, & W. W. Ward. J. Biol. Chem. 263, 7713-7716
(1988). 
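The unit cell parameters above can be related to crystal packing by a short calculation. The following is a minimal sketch, not part of the reported procedure: it computes the volume of the orthorhombic cell and, using the widely used Matthews approximation (solvent fraction of about 1 - 1.23/V_M), an estimated solvent content. The molecular weight of the crystallized, chymotrypsin-treated protein is not stated in this paragraph, so the value passed in the usage note is an assumption for illustration only.

```python
def unit_cell_volume(a, b, c):
    """Volume (cubic angstroms) of an orthorhombic cell, where all angles are 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, z, mol_weight_da):
    """Matthews coefficient V_M: cell volume per dalton of protein in the cell."""
    return cell_volume / (z * mol_weight_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (Matthews approximation)."""
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    volume = unit_cell_volume(51.8, 62.8, 70.7)  # cell edges from the text
    # 27,000 Da is an assumed molecular weight, not a value given in the text.
    v_m = matthews_coefficient(volume, z=4, mol_weight_da=27000.0)
    print(f"cell volume = {volume:.0f} A^3")
    print(f"V_M = {v_m:.2f} A^3/Da, solvent fraction = {solvent_fraction(v_m):.2f}")
```

Because the estimate depends directly on the assumed molecular weight, it should be read only as a consistency check against the solvent content used later for solvent flattening.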

The structure of GFP was determined by multiple isomorphous replacement
and anomalous scattering (Table E), solvent flattening, phase combination and
crystallographic refinement. The most remarkable feature of the fold of GFP is an eleven
stranded β-barrel wrapped around a single central helix (Fig. 1A and 1B), where each strand
consists of approximately 9-13 residues. The barrel forms a nearly perfect cylinder 42 Å
long and 24 Å in diameter. The N-terminal half of the polypeptide comprises three
anti-parallel strands, the central helix, and then 3 more anti-parallel strands, the latter of which
(residues 118-123) is parallel to the N-terminal strand (residues 11-23). The polypeptide
backbone then crosses the "bottom" of the molecule to form the second half of the barrel in
a five-strand Greek Key motif. The top end of the cylinder is capped by three short,
distorted helical segments, while one short, very distorted helical segment caps the bottom
of the cylinder. The main-chain hydrogen bonding lacing the surface of the cylinder very
likely accounts for the unusual stability of the protein towards denaturation and proteolysis.
There are no large segments of the polypeptide that could be excised while preserving the
intactness of the shell around the chromophore. Thus it would seem difficult to re-engineer
GFP to reduce its molecular weight (J. Dopf & T.M. Horiagon Gene 173:39-43 (1996)) by a
large percentage.

The p-hydroxybenzylideneimidazolidinone chromophore (C. W. Cody et al.
Biochemistry 32:1212-1218 (1993)) is completely protected from bulk solvent and centrally
located in the molecule. The total and presumably rigid encapsulation is probably
responsible for the small Stokes' shift (i.e. wavelength difference between excitation and
emission maxima), high quantum yield of fluorescence, inability of O2 to quench the excited
state (B.D. Nageswara Rao et al. Biophys. J. 32:630-632 (1980)), and resistance of the
chromophore to titration of the external pH (W. W. Ward. Bioluminescence and
Chemiluminescence (M. A. DeLuca and W. D. McElroy, eds) Academic Press pp. 235-242
(1981); W. W. Ward & S. H. Bokman. Biochemistry 21:4535-4540 (1982); W. W. Ward et
al. Photochem. Photobiol. 35:803-808 (1982)). It also allows one to rationalize why
fluorophore formation should be a spontaneous intramolecular process (R. Heim et al. Proc.
Natl. Acad. Sci. USA 91:12501-12504 (1994)), as it is difficult to imagine how an enzyme
could gain access to the substrate. The plane of the chromophore is roughly perpendicular
(60°) to the symmetry axis of the surrounding barrel. One side of the chromophore faces a
surprisingly large cavity that occupies a volume of approximately 135 Å³ (B. Lee & F. M.
Richards. J. Mol. Biol. 55:379-400 (1971)). The atomic radii were those of Lee & Richards,
calculated using the program MS with a probe radius of 1.4 Å (M. L. Connolly, Science
221:709-713 (1983)). The cavity does not open out to bulk solvent. Four water molecules
are located in the cavity, forming a chain of hydrogen bonds linking the buried side chains
of Glu222 and Gln69. Unless occupied, such a large cavity would be expected to destabilize
the protein by several kcal/mol (S. J. Hubbard et al., Protein Engineering 7:613-626 (1994);
A. E. Eriksson et al. Science 255:178-183 (1992)). Part of the volume of the cavity might
be the consequence of the compaction resulting from cyclization and dehydration reactions.
The cavity might also temporarily accommodate the oxidant, most likely O2 (A. B. Cubitt
et al. Trends Biochem. Sci. 20:448-455 (1995); R. Heim et al. Proc. Natl. Acad. Sci. USA
91:12501-12504 (1994); S. Inouye & F.I. Tsuji. FEBS Lett. 351:211-214 (1994)), that
dehydrogenates the α-β bond of Tyr66. The chromophore, cavity, and side chains that
contact the chromophore are shown in Figure 2A and a portion of the final electron density
map in this vicinity in 2B.

The opposite side of the chromophore is packed against several aromatic and
polar side chains. Of particular interest is the intricate network of polar interactions with the
chromophore (Fig. 2C). His148, Thr203 and Ser205 form hydrogen bonds with the phenolic
hydroxyl; Arg96 and Gln94 interact with the carbonyl of the imidazolidinone ring and Glu222
forms a hydrogen bond with the side chain of Thr65. Additional polar interactions, such as
hydrogen bonds to Arg96 from the carbonyl of Thr62, and the side-chain carbonyl of Gln183,
presumably stabilize the buried Arg96 in its protonated form. In turn, this buried charge
suggests that a partial negative charge resides on the carbonyl oxygen of the
imidazolidinone ring of the deprotonated fluorophore, as has previously been suggested (W.
W. Ward. Bioluminescence and Chemiluminescence (M. A. DeLuca and W. D. McElroy,
eds) Academic Press pp. 235-242 (1981); W. W. Ward & S. H. Bokman. Biochemistry
21:4535-4540 (1982); W. W. Ward et al. Photochem. Photobiol. 35:803-808 (1982)). Arg96
is likely to be essential for the formation of the fluorophore, and may help catalyze the
initial ring closure. Finally, Tyr145 shows a typical stabilizing edge-face interaction with the
benzyl ring. Trp57, the only tryptophan in GFP, is located 13 Å to 15 Å from the
chromophore and the long axes of the two ring systems are nearly parallel. This indicates
that efficient energy transfer to the latter should occur, and explains why no separate
tryptophan emission is observable (D.C. Prasher et al. Gene 111:229-233 (1992)). The two
cysteines in GFP, Cys48 and Cys70, are 24 Å apart, too distant to form a disulfide bridge.
Cys70 is buried, but Cys48 should be relatively accessible to sulfhydryl-specific reagents.
Such a reagent, 5,5'-dithiobis(2-nitrobenzoic acid), is reported to label GFP and quench its
fluorescence (S. Inouye & F.I. Tsuji FEBS Lett. 351:211-214 (1994)). This effect was
attributed to the necessity for a free sulfhydryl, but could also reflect specific quenching by
the 5-thio-2-nitrobenzoate moiety that would be attached to Cys48.

Although the electron density map is for the most part consistent with the
proposed structure of the chromophore (D.C. Prasher et al. Gene 111:229-233 (1992); C. W.
Cody et al. Biochemistry 32:1212-1218 (1993)) in the cis [Z-] configuration, with no
evidence for any substantial fraction of the opposite isomer around the chromophore double
bond, difference features are found at >4σ in the final (Fo-Fc) electron density map that can
be interpreted to represent either the intact, uncyclized polypeptide or a carbinolamine (inset
to Fig. 2C). This suggests that a significant fraction, perhaps as much as 30% of the
molecules in the crystal, have failed to undergo the final dehydration reaction.
Confirmation of incomplete dehydration comes from electrospray mass spectrometry, which
consistently shows that the average masses of both wild-type and S65T GFP (31,086±4 and
31,099.5±4 Da, respectively) are 6-7 Da higher than predicted (31,079 and 31,093 Da,
respectively) for the fully matured proteins. Such a discrepancy could be explained by a
30-35% mole fraction of apoprotein or carbinolamine with 18 or 20 Da higher molecular
weight. The natural abundance of 13C and 2H and the finite resolution of the Hewlett-Packard
5989B electrospray mass spectrometer used to make these measurements do not permit the
individual peaks to be resolved, but instead yield an average mass peak with a full width at
half maximum of approximately 15 Da. The molecular weights shown include the His-tag,
which has the sequence MRGSHHHHHH GMASMTGGQQM GRDLYDDDDK DPPAEF
(SEQ ID NO:5). Mutants of GFP that increase the efficiency of fluorophore maturation
might yield somewhat brighter preparations. In a model for the apoprotein, the Thr65-Tyr66
peptide bond is approximately in the α-helical conformation, while the peptide of
Tyr66-Gly67 appears to be tipped almost perpendicular to the helix axis by its interaction with
Arg96. This further supports the speculation that Arg96 is important in generating the
conformation required for cyclization, and possibly also for promoting the attack of Gly67 on
the carbonyl carbon of Thr65 (A. B. Cubitt et al. Trends Biochem. Sci. 20:448-455 (1995)).
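The mass spectrometry argument above rests on simple averaging: if a fraction f of the molecules is heavier than the mature protein by a fixed amount, the unresolved average mass is shifted upward by f times that amount. The sketch below, offered only as an illustration of that arithmetic, inverts the relation to estimate f from the masses quoted in the text. The function name is hypothetical, and the calculation ignores the stated ±4 Da measurement uncertainty.

```python
def immature_fraction(observed_mass, predicted_mass, mass_excess):
    """Mole fraction of a heavier species that accounts for the average mass shift.

    Assumes a two-component mixture: mature protein at predicted_mass and an
    immature form at predicted_mass + mass_excess.
    """
    return (observed_mass - predicted_mass) / mass_excess


if __name__ == "__main__":
    samples = {
        "wild-type": (31086.0, 31079.0),  # observed, predicted (Da), from the text
        "S65T": (31099.5, 31093.0),
    }
    for name, (observed, predicted) in samples.items():
        for excess in (18.0, 20.0):  # mass excess of apoprotein or carbinolamine, per the text
            f = immature_fraction(observed, predicted, excess)
            print(f"{name}: +{excess:.0f} Da species -> mole fraction {f:.2f}")
```

With the quoted values the estimates fall at roughly one third of the molecules, in line with the range given in the text.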

The results of previous random mutagenesis have implicated several amino
acid side chains as having substantial effects on the spectra, and the atomic model confirms
that these residues are close to the chromophore. The mutations T203I and E222G have
profound but opposite consequences on the absorption spectrum (T. Ehrig et al. FEBS
Letters 367:163-166 (1995)). T203I (with wild-type Ser65) lacks the 475 nm absorbance
peak usually attributed to the anionic chromophore and shows only the 395 nm peak
thought to reflect the neutral chromophore (R. Heim et al. Proc. Natl. Acad. Sci. USA
91:12501-12504 (1994); T. Ehrig et al. FEBS Letters 367:163-166 (1995)). Indeed, Thr203 is
hydrogen-bonded to the phenolic oxygen of the chromophore, so replacement by Ile should
hinder ionization of the phenolic oxygen. Mutation of Glu222 to Gly (T. Ehrig et al. FEBS
Letters 367:163-166 (1995)) has much the same spectroscopic effect as replacing Ser65 by
Gly, Ala, Cys, Val, or Thr, namely to suppress the 395 nm peak in favor of a peak at
470-490 nm (R. Heim et al. Nature 373:664-665 (1995); S. Delagrave et al. Bio/Technology
13:151-154 (1995)). Indeed Glu222 and the remnant of Thr65 are hydrogen-bonded to each
other in the present structure, probably with the uncharged carboxyl of Glu222 acting as
donor to the side chain oxygen of Thr65. Mutations E222G, S65G, S65A, and S65V would
all suppress such H-bonding. To explain why only wild-type protein has both excitation
peaks, Ser65, unlike Thr65, may adopt a conformation in which its hydroxyl donates a
hydrogen bond to and stabilizes Glu222 as an anion, whose charge then inhibits ionization of
the chromophore. The structure also explains why some mutations seem neutral. For
example, Gln80 is a surface residue far removed from the chromophore, which explains why
its accidental and ubiquitous mutation to Arg seems to have no obvious intramolecular
spectroscopic effect (M. Chalfie et al. Science 263:802-805 (1994)).

The development of GFP mutants with red-shifted excitation and emission
maxima is an interesting challenge in protein engineering (A. B. Cubitt et al. Trends
Biochem. Sci. 20:448-455 (1995); R. Heim et al. Nature 373:664-665 (1995); S. Delagrave
et al. Bio/Technology 13:151-154 (1995)). Such mutants would also be valuable for
avoidance of cellular autofluorescence at short wavelengths, for simultaneous multicolor
reporting of the activity of two or more cellular processes, and for exploitation of
fluorescence resonance energy transfer as a signal of protein-protein interaction (R. Heim &
R.Y. Tsien. Current Biol. 6:178-182 (1996)). Extensive attempts using random
mutagenesis have shifted the emission maximum by at most 6 nm to longer wavelengths, to
514 nm (R. Heim & R.Y. Tsien. Current Biol. 6:178-182 (1996)); previously described
"red-shifted" mutants merely suppressed the 395 nm excitation peak in favor of the 475 nm
peak without any significant reddening of the 505 nm emission (S. Delagrave et al.
Bio/Technology 13:151-154 (1995)). Because Thr203 is revealed to be adjacent to the
phenolic end of the chromophore, we mutated it to polar aromatic residues such as His, Tyr,
and Trp in the hope that the additional polarizability of their π systems would lower the
energy of the excited state of the adjacent chromophore. All three substitutions did indeed
shift the emission peak to greater than 520 nm (Table F). A particularly attractive mutation
was T203Y/S65G/V68L/S72A, with excitation and emission peaks at 513 and 527 nm
respectively. These wavelengths are sufficiently different from previous GFP mutants to be
readily distinguishable by appropriate filter sets on a fluorescence microscope. The
extinction coefficient, 36,500 M⁻¹cm⁻¹, and quantum yield, 0.63, are almost as high as those
of S65T (R. Heim et al. Nature 373:664-665 (1995)).
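The substitution notation used above (for example T203Y/S65G/V68L/S72A) names the original residue, its position in SEQ ID NO:2, and the replacement residue. The following sketch shows one way such a specification could be applied to a sequence while checking that each named original residue is actually present. The function name `apply_substitutions` is hypothetical, and the one-letter string is a transcription of SEQ ID NO:2 as it appears in the sequence listing of this document, so any transcription error there would carry over.

```python
import re

# One-letter transcription of SEQ ID NO:2 as printed in the sequence listing (238 residues).
SEQ_ID_NO_2 = (
    "MSKGEELFTAVVPILV" "ELDGDVNGHKFSVSGE" "GEGDVTYGKLTLKFIC"
    "TTGKLPVPWPTLVTTF" "SYGVQCFSRYPDHMKR" "HDFFKSAMPEGYVQQR"
    "TIFFKDDGNYKTRAEV" "KFEGDTLVNRIELKGI" "DFKEDGNILGHKLEYN"
    "YNSHNVYIMADKQKNG" "IKVNFKIRHNIEDGSV" "QLADYYQQNTPILDGP"
    "VLLPDNHYLSTQSALS" "KDPNEKRDHMVLLEFV" "TAAGITHGMDELYK"
)


def apply_substitutions(sequence, spec):
    """Apply substitutions such as 'T203Y/S65G' (1-based positions) to a sequence."""
    residues = list(sequence)
    for token in spec.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"{token}: expected {old} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    mutant = apply_substitutions(SEQ_ID_NO_2, "T203Y/S65G/V68L/S72A")
    print(mutant[60:75])  # residues 61-75, spanning the chromophore region
```

Checking the original residue before replacing it catches numbering slips, such as a substitution written against position 65 with the wrong starting amino acid.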

Comparison of Aequorea GFP with other protein pigments is instructive.
Unfortunately, its closest characterized homolog, the GFP from the sea pansy Renilla
reniformis (O. Shimomura and F.H. Johnson J. Cell Comp. Physiol. 59:223 (1962); J. G.
Morin and J. W. Hastings, J. Cell Physiol. 77:313 (1971); H. Morise et al. Biochemistry
13:2656 (1974); W. W. Ward Photochem. Photobiol. Reviews (Smith, K. C. ed.) 4:1 (1979);
W. W. Ward. Bioluminescence and Chemiluminescence (M. A. DeLuca and W. D.
McElroy, eds) Academic Press pp. 235-242 (1981); W. W. Ward & S. H. Bokman
Biochemistry 21:4535-4540 (1982); W. W. Ward et al. Photochem. Photobiol. 35:803-808
(1982)), has not been sequenced or cloned, though its chromophore is derived from the
same FSYG sequence as in wild-type Aequorea GFP (R. M. San Pietro et al. Photochem.
Photobiol. 57:63S (1993)). The closest analog for which a three dimensional structure is
available is the photoactive yellow protein (PYP, G. E. O. Borgstahl et al. Biochemistry
34:6278-6287 (1995)), a 14-kDa photoreceptor from halophilic bacteria. PYP in its native
dark state absorbs maximally at 446 nm and transduces light with a quantum yield of 0.64,
rather closely matching wild-type GFP's long wavelength absorbance maximum near 475
nm and fluorescence quantum yield of 0.72-0.85. The fundamental chromophore in both
proteins is an anionic p-hydroxycinnamyl group, which is covalently attached to the protein
via a thioester linkage in PYP and a heterocyclic iminolactam in GFP. Both proteins
stabilize the negative charge on the chromophore with the help of buried cationic arginine
and neutral glutamic acid groups, Arg52 and Glu46 in PYP and Arg96 and Glu222 in GFP,
though in PYP the residues are close to the oxyphenyl ring whereas in GFP they are nearer
the carbonyl end of the chromophore. However, PYP has an overall α/β fold with
appropriate flexibility and signal transduction domains to enable it to mediate the cellular
phototactic response, whereas GFP is a much more regular and rigid β-barrel to minimize
parasitic dissipation of the excited state energy as thermal or conformational motions. GFP
is an elegant example of how a visually appealing and extremely useful function, efficient
fluorescence, can be spontaneously generated from a cohesive and economical protein
structure.

A. Summary Of GFP Structure Determination

Data were collected at room temperature in house using either Molecular
Structure Corp. R-axis II or San Diego Multiwire Systems (SDMS) detectors (Cu Kα) and
later at beamline X4A at the Brookhaven National Laboratory at the selenium absorption
edge (λ = 0.979 Å) using image plates. Data were evaluated using the HKL package (Z.
Otwinowski, in Proceedings of the CCP4 Study Weekend: Data Collection and Processing,
L. Sawyer, N. Isaacs, S. Bailey, Eds. (Science and Engineering Research Council (SERC),
Daresbury Laboratory, Warrington, UK, (1991)), pp 56-62; W. Minor, XDISPLAYF
(Purdue University, West Lafayette, IN, 1993)) or the SDMS software (A. J. Howard et al.
Meth. Enzymol. 114:452-471 (1985)). Each data set was collected from a single crystal.
Heavy atom soaks were 2 mM in mother liquor for 2 days. Initial electron density maps
were based on three heavy atom derivatives using in-house data, then later were replaced
with the synchrotron data. The EMTS difference Patterson map was solved by inspection,
then used to calculate difference Fourier maps of the other derivatives. Lack of closure
refinement of the heavy atom parameters was performed using the Protein package (W.
Steigemann, in Ph.D. Thesis (Technical University, Munich, 1974)). The MIR maps were
much poorer than the overall figure of merit would suggest, and it was clear that the EMTS
isomorphous differences dominated the phasing. The enhanced anomalous occupancy for
the synchrotron data provided a partial solution to the problem. Note that the phasing power
was reduced for the synchrotron data, but the figure of merit was unchanged. All
experimental electron density maps were improved by solvent flattening using the program
DM of the CCP4 (CCP4: A Suite of Programs for Protein Crystallography (SERC
Daresbury Laboratory, Warrington WA4 4AD UK, 1979)) package assuming a solvent
content of 38%. Phase combination was performed with PHASCO2 of the Protein package
using a weight of 1.0 on the atomic model. Heavy atom parameters were subsequently
improved by refinement against combined phases. Model building proceeded with FRODO
and O (T. A. Jones et al. Acta Crystallogr. Sect. A 47:110 (1991); T. A. Jones, in
Computational Crystallography D. Sayre, Ed (Oxford University Press, Oxford, 1982) pp.
303-317) and crystallographic refinement was performed with the TNT package (D. E.
Tronrud et al. Acta Cryst. A 43:489-503 (1987)). Bond lengths and angles for the
chromophore were estimated using CHEM3D (Cambridge Scientific Computing). Final
refinement and model building was performed against the X4A selenomethionine data set,
using (2Fo-Fc) electron density maps. The data beyond 1.9 Å resolution have not been used
at this stage. The final model contains residues 2-229 as the terminal residues are not
visible in the electron density map, and the side chains of several disordered surface
residues have been omitted. Density is weak for residues 156-158 and coordinates for these
residues are unreliable. This disordering is consistent with previous analyses showing that
residues 1 and 233-238 are dispensable but that further truncations may prevent fluorescence
(J. Dopf & T.M. Horiagon. Gene 173:39-43 (1996)). The atomic model has been deposited
in the Protein Data Bank (access code 1EMA).

Atomic Model Statistics

Protein atoms                    1790
Solvent atoms                    94
Resol. range (Å)                 20-1.9
Number of reflections (F > 0)    17676
Completeness                     84.
R-factor (h)                     0.175
Mean B-value (Å²)                24.1

Deviations from ideality
Bond lengths (Å)                 0.014
Bond angles (°)                  1.9
Restrained B-values (Å²)         4.3
Ramachandran outliers            0



Notes:
(a) Completeness is the ratio of observed reflections to theoretically possible expressed
as a percentage.

(b) Shell indicates the highest resolution shell, typically 0.1-0.4 Å wide.

(c) Rmerge = Σ |I - <I>| / Σ I, where <I> is the mean of individual observations of
intensities I.

(d) Riso = Σ |I_DER - I_NAT| / Σ I_NAT

(e) Derivatives were EMTS = ethylmercurithiosalicylate (residues modified Cys48 and
Cys70), SeMet = selenomethionine substituted protein (Met1 and Met233 could not be
located); HgI4-SeMet = double derivative HgI4 on SeMet background

(f) Phasing power = <FH>/<E> where <FH> = r.m.s. heavy atom scattering and <E> = lack
of closure.

(g) FOM, mean figure of merit

(h) Standard crystallographic R-factor, R = Σ ||Fo| - |Fc|| / Σ |Fo|
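The agreement statistics defined in notes (c) and (h) are simple ratios of sums and can be illustrated with a few lines of code. The sketch below assumes the conventional forms, Rmerge = Σ|I - <I>| / ΣI over repeated measurements of each unique reflection and R = Σ||Fo| - |Fc|| / Σ|Fo| over observed and calculated structure factor amplitudes. The function names and the toy numbers in the usage note are illustrative only and are not data from this structure determination.

```python
def r_merge(measurement_groups):
    """Rmerge over groups of repeated intensity measurements.

    Each inner list holds the individual observations of one unique reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurement_groups:
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor from amplitude lists of equal length."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    # Toy values chosen for easy hand-checking, not experimental data.
    print(r_merge([[10.0, 12.0], [20.0, 20.0]]))
    print(r_factor([100.0, 50.0], [90.0, 55.0]))
```

Both statistics are dimensionless; lower values indicate better internal consistency of the data (Rmerge) or better agreement between model and data (R).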



B. Spectral properties of Thr203 ("T203") mutants compared to S65T

The mutations F64L, V68L and S72A improve the folding of GFP at 37°C
(B. P. Cormack et al. Gene 173:33 (1996)) but do not significantly shift the emission
spectra.

TABLE F

Clone   Mutations                    Excitation   Extinction coefficient   Emission
                                     max. (nm)    (10^3 M^-1 cm^-1)        max. (nm)

[...]   [...]                        513          30.8                     525
10C     T203Y/S65G/V68L/S72A         513          36.5                     527
11      T203W/S65G/S72A              502          33.0                     512
12H     T203Y/S65G/S72A              513          36.5                     527
20A     T203Y/S65G/V68L/Q69K/S72A    515          46.0                     527
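The excitation and emission maxima in Table F can be compared through the Stokes' shift, defined earlier as the wavelength difference between the two maxima. The following sketch computes that difference in nanometers and also as an energy difference in wavenumbers (cm^-1), using only the rows of Table F for which a clone name is legible here. The mutation string for clone 10C follows the wording of the text (S65G); the function names are hypothetical.

```python
# (clone, mutations, excitation max in nm, emission max in nm), taken from Table F.
TABLE_F = [
    ("10C", "T203Y/S65G/V68L/S72A", 513.0, 527.0),
    ("11", "T203W/S65G/S72A", 502.0, 512.0),
    ("12H", "T203Y/S65G/S72A", 513.0, 527.0),
    ("20A", "T203Y/S65G/V68L/Q69K/S72A", 515.0, 527.0),
]


def stokes_shift_nm(excitation_nm, emission_nm):
    """Stokes' shift as a wavelength difference in nm."""
    return emission_nm - excitation_nm


def stokes_shift_wavenumber(excitation_nm, emission_nm):
    """Stokes' shift as an energy difference in cm^-1 (1 cm = 1e7 nm)."""
    return 1.0e7 / excitation_nm - 1.0e7 / emission_nm


if __name__ == "__main__":
    for clone, mutations, excitation, emission in TABLE_F:
        shift_nm = stokes_shift_nm(excitation, emission)
        shift_cm = stokes_shift_wavenumber(excitation, emission)
        print(f"{clone:4s} {mutations:28s} {shift_nm:5.1f} nm  {shift_cm:6.1f} cm^-1")
```

Expressing the shift in wavenumbers makes mutants with different absolute wavelengths directly comparable in energy terms.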

The present invention provides novel long wavelength engineered
fluorescent proteins. While specific examples have been provided, the above description is
illustrative and not restrictive. Many variations of the invention will become apparent to
those skilled in the art upon review of this specification. The scope of the invention should,
therefore, be determined not with reference to the above description, but instead should be
determined with reference to the appended claims along with their full scope of equivalents.

All publications and patent documents cited in this application are
incorporated by reference in their entirety for all purposes to the same extent as if each
individual publication or patent document were so individually denoted.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: The Regents of the University of California et al . 

(ii) TITLE OF INVENTION: LONG WAVELENGTH MUTANT FLUORESCENT 
PROTEINS 

(iii) NUMBER OF SEQUENCES: 4 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fish & Richardson P.C.

(B) STREET: 4225 Executive Square, Suite 1400 

(C) CITY: La Jolla 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 92037 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/024,050 

(B) FILING DATE: 16-AUG-1996 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/706,408 

(B) FILING DATE: 30-AUG-1996

(C) CLASSIFICATION: 

(viii) ATTORNEY/ AGENT INFORMATION: 

(A) NAME: Haile, Lisa A. 

(B) REGISTRATION NUMBER: 38,347 

(C) REFERENCE / DOCKET NUMBER: 07257/ 056WO1 

(ix) TELECOMMUNICATION INFORMATION : 

(A) TELEPHONE: 619/678-5070 

(B) TELEFAX: 619/678-5099 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..714

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG AGT AAA GGA GAA GAA CTT TTC ACT GCA GTT GTC CCA ATT CTT GTT 48
Met Ser Lys Gly Glu Glu Leu Phe Thr Ala Val Val Pro Ile Leu Val
1 5 10 15

GAA TTA GAT GGT GAT GTT AAT GGG CAC AAA TTT TCT GTC AGT GGA GAG 96
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

GGT GAA GGT GAT GTA ACA TAC GGA AAA CTT ACC CTT AAA TTT ATT TGC 144
Gly Glu Gly Asp Val Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

ACT ACT GGA AAA CTA CCT GTT CCA TGG CCA ACA CTT GTC ACT ACT TTC 192
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

TCT TAT GGT GTT CAA TGC TTT TCA AGA TAC CCA GAT CAT ATG AAA CGG 240
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
65 70 75 80

CAT GAC TTT TTC AAG AGT GCC ATG CCC GAA GGT TAT GTA CAG CAA AGA 288
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Gln Arg
85 90 95

ACT ATA TTT TTC AAA GAT GAC GGG AAC TAC AAG ACA CGT GCT GAA GTC 336
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

AAG TTT GAA GGT GAT ACC CTT GTT AAT AGA ATC GAG TTA AAA GGT ATT 384
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

GAT TTT AAA GAA GAT GGA AAC ATT CTT GGA CAT AAA TTG GAA TAC AAC 432
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

TAT AAC TCA CAC AAT GTA TAC ATC ATG GCA GAC AAA CAA AAG AAT GGA 480
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

ATC AAA GTT AAC TTC AAA ATT AGA CAC AAC ATT GAA GAT GGA AGC GTT 528
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175

CAA CTA GCA GAC TAT TAT CAA CAA AAT ACT CCA ATT CTC GAT GGC CCT 576
Gln Leu Ala Asp Tyr Tyr Gln Gln Asn Thr Pro Ile Leu Asp Gly Pro
180 185 190

GTC CTT TTA CCA GAC AAC CAT TAC CTG TCC ACA CAA TCT GCC CTT TCG 624
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

AAA GAT CCC AAC GAA AAG AGA GAC CAC ATG GTC CTT CTT GAG TTT GTA 672
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

ACA GCT GCT GGG ATT ACA CAT GGC ATG GAT GAA CTA TAC AAA 714
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235

TA 716



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 238 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ser Lys Gly Glu Glu Leu Phe Thr Ala Val Val Pro Ile Leu Val
1 5 10 15

Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30

Gly Glu Gly Asp Val Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45

Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60

Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
65 70 75 80

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Gln Arg
85 90 95

Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110

Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125

Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140

Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160

Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175

Gln Leu Ala Asp Tyr Tyr Gln Gln Asn Thr Pro Ile Leu Asp Gly Pro
180 185 190

Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205

Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220

Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 720 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..720

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATG GTG AGC AAG GGC GAG GAG CTG TTC ACC GGG GTG GTG CCC ATC CTG 48
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
240 245 250

GTC GAG CTG GAC GGC GAC GTA AAC GGC CAC AAG TTC AGC GTG TCC GGC 96
Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
255 260 265 270

GAG GGC GAG GGC GAT GCC ACC TAC GGC AAG CTG ACC CTG AAG TTC ATC 144
Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
275 280 285

TGC ACC ACC GGC AAG CTG CCC GTG CCC TGG CCC ACC CTC GTG ACC ACC 192
Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
290 295 300

TTC GGC TAC GGC GTG CAG TGC TTC GCC CGC TAC CCC GAC CAC ATG AAG 240
Phe Gly Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys
305 310 315

CAG CAG GAC TTC TTC AAG TCC GCC ATG CCC GAA GGC TAC GTC CAG GAG 288
Gln Gln Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
320 325 330

CGC ACC ATC TTC TTC AAG GAC GAC GGC AAC TAC AAG ACC CGC GCC GAG 336
Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
335 340 345 350

GTG AAG TTC GAG GGC GAC ACC CTG GTG AAC CGC ATC GAG CTG AAG GGC 384
Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
355 360 365

ATC GAC TTC AAG GAC GAC GGC AAC ATC CTG GGG CAC AAG CTG GAG TAC 432
Ile Asp Phe Lys Asp Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
370 375 380

AAC TAC AAC AGC CAC AAC GTC TAT ATC ATG GCC GAC AAG CAG AAG AAC 480
Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
385 390 395

GGC ATC AAG GTG AAC TTC AAG ATC CGC CAC AAC ATC GAG GAC GGC AGC 528
Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
400 405 410

GTG CAG CCC GCC GAC CAC TAC CAG CAG AAC ACC CCC ATC GGC GAC GGC 576
Val Gln Pro Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
415 420 425 430

CCC GTG CTG CTG CCC GAC AAC CAC TAC CTG AGC TAC CAG TCC GCC CTG 624
Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu
435 440 445

AGC AAA GAC CCC AAC GAG AAG CGC GAT CAC ATG GTC CTG CTG GAG TTC 672
Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
450 455 460

GTG ACC GCC GCC GGG ATC ACT CAC GGC ATG GAC GAG CTG TAC AAG TAA 720
Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys *
465 470 475



(2) INFORMATION FOR SEQ ID NO:4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 240 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15

Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30

Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45

Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60

Phe Gly Tyr Gly Val Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys
65 70 75 80

Gln Gln Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95

Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110

Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125

Ile Asp Phe Lys Asp Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140

Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160

Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175

Val Gln Pro Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190

Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu
195 200 205

Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220

Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys *
225 230 235 240

WHAT IS CLAIMED IS:

1. A nucleic acid molecule comprising a nucleotide sequence encoding
a functional engineered fluorescent protein whose amino acid sequence is substantially
identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2)
and which differs from SEQ ID NO:2 by at least the substitution T203X, wherein X is an
aromatic amino acid selected from H, Y, W or F, said functional engineered fluorescent
protein having a different fluorescent property than Aequorea green fluorescent protein.

2. The nucleic acid molecule of claim 1 wherein the amino acid
sequence further comprises a substitution at S65, wherein the substitution is selected from
S65G, S65T, S65A, S65L, S65C, S65V and S65I.

3. The nucleic acid molecule of claim 1 wherein the amino acid
sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y;
S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y;
S65G/S72A/T203Y; or S65G/S72A/T203W.

4. The nucleic acid molecule of claim 1 or 2 wherein the amino acid
sequence further comprises a substitution at Y66, wherein the substitution is selected from
Y66H, Y66F, and Y66W.

5. The nucleic acid molecule of claim 1 or 2 wherein the amino acid
sequence further comprises a mutation from Table A.

6. The nucleic acid molecule of claim 1 or 2 wherein the amino acid
sequence further comprises a folding mutation.

7. The nucleic acid molecule of any of claims 1-3 wherein the
nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID
NO:1 by the substitution of at least one codon by a preferred mammalian codon.

8. The nucleic acid molecule of any of claims 1-3 encoding a fusion
protein wherein the fusion protein comprises a polypeptide of interest and the functional
engineered fluorescent protein.
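
The substitution shorthand used in claims 1 to 3 (wild-type residue, position, replacement residue, with multiple substitutions joined by "/") can be read mechanically. The following is a minimal sketch, not part of the claimed subject matter; the function names `parse_substitutions` and `has_aromatic_t203` are hypothetical and chosen only for illustration.

```python
import re

# Residues permitted for X in T203X, as recited in claim 1.
AROMATIC_203 = {"H", "Y", "W", "F"}


def parse_substitutions(spec):
    """Parse a string such as 'S65G/S72A/T203Y' into
    a list of (wild_type, position, replacement) tuples."""
    subs = []
    for token in spec.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        subs.append((match.group(1), int(match.group(2)), match.group(3)))
    return subs


def has_aromatic_t203(spec):
    """Return True if the combination includes T203X with X in H, Y, W or F."""
    return any(
        wild_type == "T" and position == 203 and new in AROMATIC_203
        for wild_type, position, new in parse_substitutions(spec)
    )
```

For example, `has_aromatic_t203("S65G/S72A/T203Y")` evaluates to true, while a string containing only `S65T` does not.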

9. An expression vector comprising expression control sequences
operatively linked to a nucleic acid molecule comprising a nucleotide sequence encoding a
functional engineered fluorescent protein whose amino acid sequence is substantially
identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2)
and which differs from SEQ ID NO:2 by at least the amino acid substitution T203X,
wherein X is an aromatic amino acid selected from H, Y, W or F, said functional engineered
fluorescent protein having a different fluorescent property than Aequorea green fluorescent
protein.

10. The expression vector of claim 9 wherein the amino acid sequence
further comprises a substitution at S65, wherein the substitution is selected from S65G,
S65T, S65A, S65L, S65C, S65V and S65I.

11. The expression vector of claim 9 wherein the amino acid sequence
differs by no more than the substitutions S65T/T203H; S65T/T203Y;
S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y;
S65G/S72A/T203Y; or S65G/S72A/T203W.

12. The expression vector of claim 10 or 11 wherein the amino acid
sequence further comprises a substitution at Y66, wherein the substitution is selected from
Y66H, Y66F, and Y66W.

13. The expression vector of claim 10 or 11 wherein the amino acid
sequence further comprises a mutation from Table A.

14. The expression vector of claim 9 or 10 wherein the amino acid
sequence further comprises a folding mutation.

15. The expression vector of any of claims 9-11 wherein the nucleotide
sequence encoding the protein differs from the nucleotide sequence of SEQ ID NO:1 by the
substitution of at least one codon by a preferred mammalian codon.

16. The expression vector of any of claims 9-11 encoding a fusion
protein wherein the fusion protein comprises a polypeptide of interest and the functional
engineered fluorescent protein.

17. A recombinant host cell comprising an expression vector that
comprises expression control sequences operatively linked to a nucleic acid molecule
comprising a nucleotide sequence encoding a functional engineered fluorescent protein
whose amino acid sequence is substantially identical to the amino acid sequence of
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2
by at least the amino acid substitution T203X, wherein X is an aromatic amino acid selected
from H, Y, W or F, said functional engineered fluorescent protein having a different
fluorescent property than Aequorea green fluorescent protein.

18. The recombinant host cell of claim 17 wherein the amino acid
sequence further comprises a substitution at S65, wherein the substitution is selected from
S65G, S65T, S65A, S65L, S65C, S65V and S65I.

19. The recombinant host cell of claim 17 wherein the amino acid
sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y;
S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y;
S65G/S72A/T203Y; or S65G/S72A/T203W.

20. The recombinant host cell of claim 17 or 18 wherein the amino acid
sequence further comprises a substitution at Y66, wherein the substitution is selected from
Y66H, Y66F, and Y66W.

21. The recombinant host cell of claim 17 or 18 wherein the amino acid
sequence further comprises a mutation from Table A.

22. The recombinant host cell of claim 17 or 18 wherein the amino acid
sequence further comprises a folding mutation.

23. The recombinant host cell of any of claims 17-19 wherein the
nucleotide sequence encoding the protein differs from the nucleotide sequence of SEQ ID
NO:1 by the substitution of at least one codon by a preferred mammalian codon.

24. The recombinant host cell of any of claims 17-19 encoding a fusion
protein wherein the fusion protein comprises a polypeptide of interest and the functional
engineered fluorescent protein.

25. The recombinant host cell of any of claims 17-19 which is a
prokaryotic cell.

26. The recombinant host cell of any of claims 17-19 which is a
eukaryotic cell.

27. A functional engineered fluorescent protein whose amino acid
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least the amino acid
substitution T203X, wherein X is an aromatic amino acid selected from H, Y, W or F, said
functional engineered fluorescent protein having a different fluorescent property than
Aequorea green fluorescent protein.

28. The protein of claim 27 wherein the amino acid sequence further
comprises a substitution at S65, wherein the substitution is selected from S65G, S65T,
S65A, S65L, S65C, S65V and S65I.

29. The protein of claim 27 wherein the amino acid sequence differs by
no more than the substitutions S65T/T203H; S65T/T203Y; S72A/F64L/S65G/T203Y;
S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y; S65G/S72A/T203Y; or
S65G/S72A/T203W.

30. The protein of claim 27 or 28 wherein the amino acid sequence
further comprises a substitution at Y66, wherein the substitution is selected from Y66H,
Y66F, and Y66W.

31. The protein of claim 27 or 28 wherein the amino acid sequence
further comprises a folding mutation.



32. The protein of any of claims 27-29 which is a fusion protein wherein
the fusion protein comprises a polypeptide of interest and the functional engineered
fluorescent protein.

33. A fluorescently labelled antibody comprising an antibody coupled to
a functional engineered fluorescent protein whose amino acid sequence is substantially
identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2)
and which differs from SEQ ID NO:2 by at least the amino acid substitution T203X,
wherein X is an aromatic amino acid selected from H, Y, W or F, said functional engineered
fluorescent protein having a different fluorescent property than Aequorea green fluorescent
protein.

34. The fluorescently labelled antibody of claim 33 wherein the amino
acid sequence further comprises a substitution at S65, wherein the substitution is selected
from S65G, S65T, S65A, S65L, S65C, S65V and S65I.

35. The fluorescently labelled antibody of claim 33 wherein the amino
acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y;
S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y;
S65G/S72A/T203Y; or S65G/S72A/T203W.

36. The fluorescently labelled antibody of claim 33 or 34 wherein the
amino acid sequence further comprises a substitution at Y66, wherein the substitution is
selected from Y66H, Y66F, and Y66W.

37. The fluorescently labelled antibody of any of claims 33-35 which is a
fusion protein wherein the fusion protein comprises the antibody fused to the functional
engineered fluorescent protein.

38. A nucleic acid molecule comprising a nucleotide sequence encoding
an antibody fused to a nucleotide sequence encoding a functional engineered fluorescent
protein whose amino acid sequence is substantially identical to the amino acid sequence of
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2
by at least the amino acid substitution T203X, wherein X is an aromatic amino acid selected
from H, Y, W or F, said functional engineered fluorescent protein having a different
fluorescent property than Aequorea green fluorescent protein.

39. The nucleic acid molecule of claim 38 wherein the amino acid
sequence further comprises a substitution at S65, wherein the substitution is selected from
S65G, S65T, S65A, S65L, S65C, S65V and S65I.



40. The nucleic acid molecule of claim 38 wherein the amino acid
sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y;
S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y;
S65G/S72A/T203Y; or S65G/S72A/T203W.

41. The nucleic acid molecule of claim 38 or 39 wherein the amino acid
sequence further comprises a substitution at Y66, wherein the substitution is selected from
Y66H, Y66F, and Y66W.



42. A fluorescently labelled nucleic acid probe comprising a nucleic acid
probe coupled to a functional engineered fluorescent protein whose amino acid sequence is
substantially identical to the amino acid sequence of Aequorea green fluorescent protein
(SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least the amino acid substitution
T203X, wherein X is an aromatic amino acid selected from H, Y, W or F, said functional
engineered fluorescent protein having a different fluorescent property than Aequorea green
fluorescent protein.

43. The fluorescently labelled nucleic acid probe of claim 42 wherein the
amino acid sequence further comprises a substitution at S65, wherein the substitution is
selected from S65G, S65T, S65A, S65L, S65C, S65V and S65I.

44. The fluorescently labelled nucleic acid probe of claim 42 wherein the
amino acid sequence differs by no more than the substitutions S65T/T203H; S65T/T203Y;
S72A/F64L/S65G/T203Y; S72A/S65G/V68L/T203Y; S65G/V68L/Q69K/S72A/T203Y;
S65G/S72A/T203Y; or S65G/S72A/T203W.

45. The nucleic acid molecule of claim 42 or 43 wherein the amino acid
sequence further comprises a substitution at Y66, wherein the substitution is selected from
Y66H, Y66F, and Y66W.

46. A nucleic acid molecule comprising a nucleotide sequence encoding
a functional engineered fluorescent protein whose amino acid sequence is substantially
identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2)
and which differs from SEQ ID NO:2 by at least an amino acid substitution at L42, V61,
T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not
E222G), or V224, said functional engineered fluorescent protein having a different
fluorescent property than Aequorea green fluorescent protein.

47. The nucleic acid molecule of claim 46 wherein the amino acid
substitution is:
L42X, wherein X is selected from C, F, H, W and Y,
V61X, wherein X is selected from F, Y, H and C,
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C,
V68X, wherein X is selected from F, Y and H,
Q69X, wherein X is selected from K, R, E and G,
Q94X, wherein X is selected from D, E, H, K and N,
N121X, wherein X is selected from F, H, W and Y,
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q,
H148X, wherein X is selected from F, Y, N, K, Q and R,
V150X, wherein X is selected from F, Y and H,
F165X, wherein X is selected from H, Q, W and Y,
I167X, wherein X is selected from F, Y and H,
Q183X, wherein X is selected from H, Y, E and K,
N185X, wherein X is selected from D, E, H, K and Q,
L220X, wherein X is selected from H, N, Q and T,
E222X, wherein X is selected from N and Q, or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.
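
The position-by-position list in claim 47 amounts to a lookup table from a wild-type residue and position to the set of listed replacements. The sketch below encodes that table as read from the claim; the names `ALLOWED` and `is_listed` are hypothetical and serve only as an illustration of how such a list might be checked programmatically.

```python
# Replacement residues listed in claim 47, keyed by (wild-type residue, position).
ALLOWED = {
    ("L", 42): "CFHWY",
    ("V", 61): "FYHC",
    ("T", 62): "AVFSDNQYHC",
    ("V", 68): "FYH",
    ("Q", 69): "KREG",
    ("Q", 94): "DEHKN",
    ("N", 121): "FHWY",
    ("Y", 145): "WCFLEHKQ",
    ("H", 148): "FYNKQR",
    ("V", 150): "FYH",
    ("F", 165): "HQWY",
    ("I", 167): "FYH",
    ("Q", 183): "HYEK",
    ("N", 185): "DEHKQ",
    ("L", 220): "HNQT",
    ("E", 222): "NQ",
    ("V", 224): "HNQTFWY",
}


def is_listed(wild_type, position, replacement):
    """Return True if the single-residue substitution appears in the table."""
    if len(replacement) != 1:
        return False
    return replacement in ALLOWED.get((wild_type, position), "")
```

With this table, `is_listed("E", 222, "Q")` is true, whereas `is_listed("E", 222, "G")` is false, consistent with the exclusion of E222G in claim 46.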

48. An expression vector comprising expression control sequences
operatively linked to a nucleic acid molecule comprising a nucleotide sequence encoding
a functional engineered fluorescent protein whose amino acid sequence is substantially
identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2)
and which differs from SEQ ID NO:2 by at least an amino acid substitution at L42, V61,
T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not
E222G), or V224, said functional engineered fluorescent protein having a different
fluorescent property than Aequorea green fluorescent protein.

49. The expression vector of claim 48 wherein the amino acid
substitution is:
L42X, wherein X is selected from C, F, H, W and Y,
V61X, wherein X is selected from F, Y, H and C,
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C,
V68X, wherein X is selected from F, Y and H,
Q69X, wherein X is selected from K, R, E and G,
Q94X, wherein X is selected from D, E, H, K and N,
N121X, wherein X is selected from F, H, W and Y,
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q,
H148X, wherein X is selected from F, Y, N, K, Q and R,
V150X, wherein X is selected from F, Y and H,
F165X, wherein X is selected from H, Q, W and Y,
I167X, wherein X is selected from F, Y and H,
Q183X, wherein X is selected from H, Y, E and K,
N185X, wherein X is selected from D, E, H, K and Q,
L220X, wherein X is selected from H, N, Q and T,
E222X, wherein X is selected from N and Q, or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.

50. A recombinant host cell comprising an expression vector that
comprises expression control sequences operatively linked to a nucleic acid molecule
comprising a nucleotide sequence encoding a functional engineered fluorescent protein
whose amino acid sequence is substantially identical to the amino acid sequence of
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2
by at least an amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145,
H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional
engineered fluorescent protein having a different fluorescent property than Aequorea green
fluorescent protein.

51. The recombinant host cell of claim 50 wherein the amino acid
substitution is:
L42X, wherein X is selected from C, F, H, W and Y,
V61X, wherein X is selected from F, Y, H and C,
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C,
V68X, wherein X is selected from F, Y and H,
Q69X, wherein X is selected from K, R, E and G,
Q94X, wherein X is selected from D, E, H, K and N,
N121X, wherein X is selected from F, H, W and Y,
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q,
H148X, wherein X is selected from F, Y, N, K, Q and R,
V150X, wherein X is selected from F, Y and H,
F165X, wherein X is selected from H, Q, W and Y,
I167X, wherein X is selected from F, Y and H,
Q183X, wherein X is selected from H, Y, E and K,
N185X, wherein X is selected from D, E, H, K and Q,
L220X, wherein X is selected from H, N, Q and T,
E222X, wherein X is selected from N and Q, or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.

52. A functional engineered fluorescent protein whose amino acid
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least an amino acid
substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167,
Q183, N185, L220, E222 (E222G), or V224, said functional engineered fluorescent protein
having a different fluorescent property than Aequorea green fluorescent protein.

53. The functional engineered fluorescent protein of claim 52 wherein the
amino acid substitution is:
L42X, wherein X is selected from C, F, H, W and Y,
V61X, wherein X is selected from F, Y, H and C,
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C,
V68X, wherein X is selected from F, Y and H,
Q69X, wherein X is selected from K, R, E and G,
Q94X, wherein X is selected from D, E, H, K and N,
N121X, wherein X is selected from F, H, W and Y,
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q,
H148X, wherein X is selected from F, Y, N, K, Q and R,
V150X, wherein X is selected from F, Y and H,
F165X, wherein X is selected from H, Q, W and Y,
I167X, wherein X is selected from F, Y and H,
Q183X, wherein X is selected from H, Y, E and K,
N185X, wherein X is selected from D, E, H, K and Q,
L220X, wherein X is selected from H, N, Q and T,
E222X, wherein X is selected from N and Q, or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.

54. A fluorescently labelled antibody comprising an antibody coupled to
a functional engineered fluorescent protein whose amino acid sequence is substantially
identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2)
and which differs from SEQ ID NO:2 by at least an amino acid substitution at L42, V61,
T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185, L220, E222 (not
E222G), or V224, said functional engineered fluorescent protein having a different
fluorescent property than Aequorea green fluorescent protein.

55. The antibody of claim 54 wherein the amino acid substitution is:
L42X, wherein X is selected from C, F, H, W and Y,
V61X, wherein X is selected from F, Y, H and C,
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C,
V68X, wherein X is selected from F, Y and H,
Q69X, wherein X is selected from K, R, E and G,
Q94X, wherein X is selected from D, E, H, K and N,
N121X, wherein X is selected from F, H, W and Y,
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q,
H148X, wherein X is selected from F, Y, N, K, Q and R,
V150X, wherein X is selected from F, Y and H,
F165X, wherein X is selected from H, Q, W and Y,
I167X, wherein X is selected from F, Y and H,
Q183X, wherein X is selected from H, Y, E and K,
N185X, wherein X is selected from D, E, H, K and Q,
L220X, wherein X is selected from H, N, Q and T,
E222X, wherein X is selected from N and Q, or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.

56. A nucleic acid molecule comprising a nucleotide sequence encoding
an antibody fused to a nucleotide sequence encoding a functional engineered fluorescent
protein whose amino acid sequence is substantially identical to the amino acid sequence of
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2
by at least an amino acid substitution at L42, V61, T62, V68, Q69, Q94, N121, Y145,
H148, V150, F165, I167, Q183, N185, L220, E222 (not E222G), or V224, said functional
engineered fluorescent protein having a different fluorescent property than Aequorea green
fluorescent protein.

57. The nucleic acid molecule of claim 56 wherein the amino acid
substitution is:
L42X, wherein X is selected from C, F, H, W and Y,
V61X, wherein X is selected from F, Y, H and C,
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C,
V68X, wherein X is selected from F, Y and H,
Q69X, wherein X is selected from K, R, E and G,
Q94X, wherein X is selected from D, E, H, K and N,
N121X, wherein X is selected from F, H, W and Y,
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q,
H148X, wherein X is selected from F, Y, N, K, Q and R,
V150X, wherein X is selected from F, Y and H,
F165X, wherein X is selected from H, Q, W and Y,
I167X, wherein X is selected from F, Y and H,
Q183X, wherein X is selected from H, Y, E and K,
N185X, wherein X is selected from D, E, H, K and Q,
L220X, wherein X is selected from H, N, Q and T,
E222X, wherein X is selected from N and Q, or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.

58. A fluorescently labelled nucleic acid probe comprising a nucleic acid
probe coupled to a functional engineered fluorescent protein whose amino acid sequence is
substantially identical to the amino acid sequence of Aequorea green fluorescent protein
(SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least an amino acid substitution
at L42, V61, T62, V68, Q69, Q94, N121, Y145, H148, V150, F165, I167, Q183, N185,
L220, E222 (E222G), or V224, said functional engineered fluorescent protein having a
different fluorescent property than Aequorea green fluorescent protein.

59. The probe of claim 58 wherein the amino acid substitution is:
L42X, wherein X is selected from C, F, H, W and Y,
V61X, wherein X is selected from F, Y, H and C,
T62X, wherein X is selected from A, V, F, S, D, N, Q, Y, H and C,
V68X, wherein X is selected from F, Y and H,
Q69X, wherein X is selected from K, R, E and G,
Q94X, wherein X is selected from D, E, H, K and N,
N121X, wherein X is selected from F, H, W and Y,
Y145X, wherein X is selected from W, C, F, L, E, H, K and Q,
H148X, wherein X is selected from F, Y, N, K, Q and R,
V150X, wherein X is selected from F, Y and H,
F165X, wherein X is selected from H, Q, W and Y,
I167X, wherein X is selected from F, Y and H,
Q183X, wherein X is selected from H, Y, E and K,
N185X, wherein X is selected from D, E, H, K and Q,
L220X, wherein X is selected from H, N, Q and T,
E222X, wherein X is selected from N and Q, or
V224X, wherein X is selected from H, N, Q, T, F, W and Y.

60. A method for determining whether a mixture contains a target
comprising:
contacting the mixture with a fluorescently labelled probe comprising
a probe and a functional engineered fluorescent protein of claim 27 or claim 52; and
determining whether the target has bound to the probe.

61. The method of claim 60 wherein the target is bound to a solid matrix.

62. A method for engineering a functional engineered fluorescent protein
having a fluorescent property different than Aequorea green fluorescent protein, comprising
substituting an amino acid that is located no more than 0.5 nm from any atom in the
chromophore of an Aequorea-related green fluorescent protein with another amino acid;
whereby the substitution alters a fluorescent property of the protein.
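
The 0.5 nm criterion of claim 62 can be evaluated against a set of atomic coordinates by a simple distance test. The following is a minimal sketch under stated assumptions: coordinates are supplied in angstroms (so the cutoff is converted from 0.5 nm to 5 angstroms), the chromophore is identified by a caller-supplied set of residue numbers, and the `toy_atoms` values are invented for illustration and are not taken from the coordinates of Figs. 5-1 to 5-28. The function name `residues_near_chromophore` is hypothetical.

```python
import math

CUTOFF_NM = 0.5  # distance threshold recited in claim 62


def residues_near_chromophore(atoms, chromophore_ids, cutoff_nm=CUTOFF_NM):
    """Return sorted residue numbers having at least one atom within the cutoff
    of any chromophore atom.

    atoms: iterable of (residue_number, (x, y, z)), coordinates in angstroms.
    chromophore_ids: set of residue numbers that make up the chromophore.
    """
    cutoff_angstrom = cutoff_nm * 10.0  # 1 nm = 10 angstroms
    chromophore_atoms = [xyz for rid, xyz in atoms if rid in chromophore_ids]
    near = set()
    for rid, xyz in atoms:
        if rid in chromophore_ids:
            continue  # skip the chromophore itself
        if any(math.dist(xyz, c) <= cutoff_angstrom for c in chromophore_atoms):
            near.add(rid)
    return sorted(near)


# Invented coordinates, for illustration only.
toy_atoms = [
    (66, (0.0, 0.0, 0.0)),    # a chromophore atom
    (203, (3.0, 2.0, 1.0)),   # about 3.7 angstroms away
    (148, (4.0, 2.0, 0.0)),   # about 4.5 angstroms away
    (10, (12.0, 0.0, 0.0)),   # 12 angstroms away
]
```

Calling `residues_near_chromophore(toy_atoms, {65, 66, 67})` on the toy data selects residues 148 and 203 and excludes residue 10.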



63. The method of claim 62 wherein the amino acid substitution alters the
electronic environment of the chromophore.

64. A method for engineering a functional engineered fluorescent protein
having a different fluorescent property than Aequorea green fluorescent protein comprising
substituting amino acids in a loop domain of an Aequorea-related green fluorescent protein
with amino acids so as to create a consensus sequence for phosphorylation or for
proteolysis.

65. A method for producing fluorescence resonance energy transfer
comprising:
providing a donor molecule comprising a functional engineered
fluorescent protein of claim 27 or claim 52;
providing an appropriate acceptor molecule for the fluorescent
protein; and
bringing the donor molecule and the acceptor molecule into
sufficiently close contact to allow fluorescence resonance energy transfer.
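
What counts as "sufficiently close contact" in claim 65 is not quantified in the claim. As background only, the standard Förster relation gives the transfer efficiency as E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the Förster radius of the pair. The sketch below evaluates that relation; the function name `fret_efficiency` is hypothetical, and any R0 value passed to it is an assumption of the caller, not a figure from this document.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster transfer efficiency E = 1 / (1 + (r / R0)**6).

    distance_nm: donor-acceptor separation r, in nanometres.
    forster_radius_nm: Förster radius R0 of the donor/acceptor pair, in nanometres.
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)
```

Because of the sixth-power dependence, efficiency is 50% when r equals R0 and falls off steeply beyond it, which is why energy transfer is only observed when donor and acceptor are brought close together.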

66. A method for producing fluorescence resonance energy transfer
comprising:
providing an acceptor molecule comprising a functional engineered
fluorescent protein of claim 27 or claim 52;
providing an appropriate donor molecule for the fluorescent protein;
and
bringing the donor molecule and the acceptor molecule into
sufficiently close contact to allow fluorescence resonance energy transfer.

67. The method of claim 66 wherein the donor molecule is an engineered
fluorescent protein whose amino acid sequence comprises the substitution T203I and the
acceptor molecule is a mutant fluorescent protein whose amino acid sequence comprises the
substitution T203X, wherein X is an aromatic amino acid selected from H, Y, W or F, said
functional engineered fluorescent protein having a different fluorescent property than
Aequorea green fluorescent protein.

68. A nucleic acid molecule comprising a nucleotide sequence encoding
a functional engineered fluorescent protein whose amino acid sequence is substantially
identical to the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2)
and which differs from SEQ ID NO:2 by at least an amino acid substitution located no more
than about 0.5 nm from the chromophore of the engineered fluorescent protein, wherein the
substitution alters the electronic environment of the chromophore, whereby the functional
engineered fluorescent protein has a different fluorescent property than Aequorea green
fluorescent protein.

69. An expression vector comprising expression control sequences 
operatively linked to a nucleotide sequence encoding a functional engineered fluorescent 
protein whose amino acid sequence is substantially identical to the amino acid sequence of 
Aequorea green fluorescent protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 
by at least an amino acid substitution located no more than about 0.5 nm from the 
chromophore of the engineered fluorescent protein, wherein the substitution alters the 
electronic environment of the chromophore, whereby the functional engineered fluorescent 
protein has a different fluorescent property than Aequorea green fluorescent protein. 

70. A functional engineered fluorescent protein whose amino acid 
sequence is substantially identical to the amino acid sequence of Aequorea green fluorescent 
protein (SEQ ID NO:2) and which differs from SEQ ID NO:2 by at least an amino acid 
substitution located no more than about 0.5 nm from the chromophore of the engineered 
fluorescent protein, wherein the substitution alters the electronic environment of the 
chromophore, whereby the functional engineered fluorescent protein has a different 
fluorescent property than Aequorea green fluorescent protein. 

71. A crystal of a protein comprising a fluorescent protein with an amino 
acid sequence substantially identical to SEQ ID NO: 2, wherein said crystal diffracts with at 
least a 2.0 to 3.0 angstrom resolution. 




72. The crystal of claim 71, wherein the fluorescent protein has at least
200 amino acids, a completeness value of at least 80% and has a crystal stability within
0.5% of its unit cell dimensions.

73. The crystal of claim 71, wherein the amino acid sequence comprises a
substitution at S65, wherein the substitution is selected from S65G, S65T, S65A, S65L,
S65C, S65V and S65I.

74. The crystal of claim 71, wherein said crystal has the following unit
cell dimensions in angstroms: a = 51.8, b = 62.8 and c = 70.7 with a space group of P2₁2₁2₁
and an α angle of 90.00°, a β angle of 90.00° and a γ angle of 90.00° and the crystal has
a diffraction limit where 90% or greater of the potential reflections can be used to determine
the coordinates of the atoms.
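
For orientation, the unit cell volume implied by the dimensions in claim 74 follows from the general cell-volume formula, which reduces to a × b × c when all three angles are 90°. The sketch below is illustrative only; the function name `unit_cell_volume` is hypothetical, and the computed volume is a derived quantity that the claim itself does not state.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg=90.0, beta_deg=90.0, gamma_deg=90.0):
    """Volume of a crystallographic unit cell (general triclinic formula).

    a, b, c: cell edge lengths (angstroms); angles in degrees.
    Result is in cubic angstroms.
    """
    ca, cb, cg = (
        math.cos(math.radians(angle)) for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


# Cell edges recited in claim 74, with all angles at 90 degrees.
volume = unit_cell_volume(51.8, 62.8, 70.7)
```

With all angles at 90° the square-root term is 1, so `volume` equals the plain product of the three edge lengths.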

75. A computational method of designing a fluorescent protein
comprising:
determining from a three dimensional model of a crystallized
fluorescent protein comprising a fluorescent protein with a bound ligand, at least one
interacting amino acid of the fluorescent protein that interacts with at least one first
chemical moiety of the ligand, and
selecting at least one chemical modification of the first chemical
moiety to produce a second chemical moiety with a structure to either decrease or increase
an interaction between the interacting amino acid and the second chemical moiety compared
to the interaction between the interacting amino acid and the first chemical moiety.

76. The computational method of claim 75, further comprising generating
the three dimensional model of the crystallized protein comprising a fluorescent protein
with an amino acid sequence substantially identical to SEQ ID NO:2.

77. The computational method of claim 75, wherein the selecting selects
the first chemical moiety that interacts with at least one of the amino acids listed in Figs. 5-1
to 5-28.

78. The computational method of claim 75, wherein the chemical
modification enhances hydrogen bonding interaction, charge interaction, hydrophobic
interaction, Van Der Waals interaction or dipole interaction between the second chemical
moiety and the interacting amino acid compared to the first chemical moiety and the
interacting amino acid.

79. A computational method of modeling the three dimensional structure
of a fluorescent protein comprising determining a three dimensional relationship between at
least two atoms listed in the atomic coordinates of Figs. 5-1 to 5-28.

80. The computational method of claim 79, wherein the determining
comprises determining the three dimensional structure of a fluorescent protein with an
amino acid sequence at least 80% identical to SEQ ID NO:2.

81. The computational method of claim 79, wherein the determining
comprises determining the three dimensional structure of a fluorescent protein with an
amino acid sequence at least 95% identical to SEQ ID NO:2.

82. The computational method of claim 79, wherein the determining
comprises determining the three dimensional relationship of at least 1500 atoms listed in
Figs. 5-1 to 5-28.
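
Claims 80 and 81 refer to sequences at least 80% or 95% identical to SEQ ID NO:2 but do not themselves state how identity is computed. One common convention, shown here purely as an assumption, is to count matching positions over the columns of a pre-computed pairwise alignment. The function name `percent_identity` and the use of "-" as the gap character are illustrative choices.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a residue aligned against a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

Under this convention a single substitution in a six-residue stretch, for example `percent_identity("FSYGVQ", "FGYGVQ")`, gives five matches out of six positions. Other conventions (for instance, dividing by the length of the shorter sequence) yield different values, so the chosen definition should be stated whenever a threshold is applied.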



83. A device comprising a storage device and, stored in the device, at
least 10 atomic coordinates selected from the atomic coordinates listed in Figs. 5-1 to 5-28.

84. The device of claim 83, wherein the storage device is a computer
readable device that stores code that receives as input the atomic coordinates.

85. The device of claim 84, wherein the computer readable device is a floppy
disk or a hard drive.

86. A nucleic acid molecule comprising a nucleotide sequence encoding a functional
engineered fluorescent protein whose amino acid sequence is substantially identical to
the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2) and
which differs from SEQ ID NO:2 by at least a substitution at Q69, wherein said
functional engineered fluorescent protein has a different fluorescent property than
Aequorea green fluorescent protein.

87. The nucleic acid molecule of claim 86, wherein said substitution at Q69 is selected
from the group of R, E and G.

88. The nucleic acid molecule of claim 86, wherein said amino acid sequence further
comprises a function mutation at S65.

13 89. A nucleic acid molecule comprising a nucleotide sequence encoding a functional 

1 4 engineered fluorescent protein whose amino acid sequence is substantially identical to 

15 the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2) and 

1 6 which differs from SEQ ID NO:2 by at least a substitution at E222, but not including 

1 7 E222G, wherein said functional engineered fluorescent protein has a different 

1 8 fluorescent property than Aequorea green fluorescent protein. 

1 9 90. The nucleic acid molecule of claim 89, wherein said substitution at E222 is selected 
2 C from the group of N end Q. 

21 91. The nucleic acid molecule of claim 89, wherein said amino acid sequence further 

2 2 comprises a function mutation at F64 

23 92. A nucleic acid molecule comprising a nucleotide sequence encoding a functional 

2 4 engineered fluorescent protein whose amino acid sequence is substantially identical to 

25 the amino acid sequence of Aequorea green fluorescent protein (SEQ ID NO:2) and 

2 6 which differs from SEQ ID NO:2 by at least a substitution at Y145, wherein said 

27 functional engineered fluorescent protein has a different fluorescent property than 

2 8 Aequorea green fluorescent protein. 

2 9 93. The nucleic acid molecule of claim 92, wherein said substitution at Y145 is selected 

3 0 from the group of W, C, F, L, E, H, K and Q . 
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3 1 94. The nucleic acid molecule of claim 92, wherein said amino acid sequence further 
3 2 comprises a function mutation at Y66. 
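The substitution notation used in these claims (wild-type residue, position in SEQ ID NO:2, replacement residue, for example E222Q) can be illustrated with a short sketch. It is illustrative only: the one-letter sequence below is transcribed from the scanned amino acid listing of SEQ ID NO:2 and may contain transcription errors, and the helper name `apply_substitution` is hypothetical.

```python
import re

# SEQ ID NO:2 in one-letter code, 16 residues per fragment as in the sequence
# figure (transcribed from the scan; treat as illustrative).
GFP = (
    "MSKGEELFTGVVPILV" "ELDGDVNGHKFSVSGE" "GEGDATYGKLTLKFIC" "TTGKLPVPWPTLVTTF"
    "SYGVQCFSRYPDHMKR" "HDFFKSAMPEGYVQER" "TIFFKDDGNYKTRAEV" "KFEGDTLVNRIELKGI"
    "DFKEDGNILGHKLEYN" "YNSHNVYIMADKQKNG" "IKVNFKIRHNIEDGSV" "QLADHYQQNTPIGDGP"
    "VLLPDNHYLSTQSALS" "KDPNEKRDHMVLLEFV" "TAAGITHGMDELYK"
)

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Return a copy of `sequence` with one substitution such as 'Y145F' applied."""
    match = _SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized substitution: {mutation!r}")
    original, replacement = match.group(1), match.group(3)
    position = int(match.group(2))
    index = position - 1  # residue numbering is 1-based
    if not 0 <= index < len(sequence):
        raise ValueError(f"{mutation}: position {position} is outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"{mutation}: expected {original} at {position}, found {sequence[index]}")
    return sequence[:index] + replacement + sequence[index + 1:]


# Apply one substitution from each of claims 87, 90 and 93.
variant = GFP
for mutation in ("Q69R", "E222Q", "Y145F"):
    variant = apply_substitution(variant, mutation)

changed = [i + 1 for i, (a, b) in enumerate(zip(GFP, variant)) if a != b]
print("changed positions:", changed)
```

Checking the wild-type residue before substituting catches numbering mistakes early: a request such as A69R is rejected because position 69 of the reference is Q.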

95. A method of identifying a test chemical, comprising:
contacting a test chemical with a sample containing a biological entity labeled with a
functional, engineered fluorescent protein or a polynucleotide encoding said functional,
engineered fluorescent protein, and
detecting fluorescence of said functional engineered fluorescent protein.

96. The method of claim 95, wherein said fluorescence in the presence of a test
chemical is greater than in the absence of said test chemical.

97. The method of claim 96, wherein said polynucleotide encoding said functional,
engineered fluorescent protein is operatively linked to a genomic polynucleotide.

98. The method of claim 95, wherein said functional, engineered fluorescent protein is
fused to a second functional protein.

99. The method of claim 96, wherein said polynucleotide encoding said functional,
engineered fluorescent protein is operatively linked to a response element.

100. The method of claim 96, wherein said polynucleotide encoding said functional,
engineered fluorescent protein is operatively linked to a response element in a
mammalian cell.



Figure 1a

Figure 1b

Figure 2b
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATC AST AAA CCA GAA CAA CTT TTC ACT CCA CTT CTC CCA A77 CTT CT7   48
Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1               5                   10                  15

CAA TTA CAT CCT CAT CTT AAT CCC CAC AAA TTT TCT CTC ACT CCA CAC   96
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
            20                  25                  30

CCT CAA CCT CAT CCA ACA UC CCA AAA CTT ACC CTT AAA T7T ATT TCC   144
Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
        35                  40                  45

ACT ACT CCA AAA CTA CCT CTT CCA TCC CCA ACA CTT CTC ACT ACT TTC   192
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
    50                  55                  60

TCT TAT CCT CTT CAA TCC TTT TCA ACA TAC CCA CAT CAT ATC AAA CCC   240
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
65                  70                  75                  80

CAT CAC TTT TTC AAC AST CCC ATC CCC CAA CCT TAT CTA CAC CAA ACA   288
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
                85                  90                  95

ACT ATA TTT TTC AAA CAT CAC CCC AAC TAC AAC ACA CCT CCT CAA CTC   336
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
            100                 105                 110

AAC TTT CAA CCT CAT ACC CTT CTT AAT ACA ATC CAC TTA AAA CCT ATT   384
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
        115                 120                 125

CAT TTT AAA CAA CAT CCA AAC ATT CTT CCA CAC AAA TTC CAA TAC AAC   432
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
    130                 135                 140

TAT AAC TCA CAC AAT CTA TAC ATC ATC CCA CAC AAA CAA AAC AAT CCA   480
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145                 150                 155                 160

ATC AAA CTT AAC TTC AAA ATT ACA CAC AAC ATT CAA CAT CCA ACC CTT   528
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
                165                 170                 175

CAA CTA CCA CAC CAT TAT CAA CAA AAT ACT CCA ATT CCC CAT CCC CCT   576
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
            180                 185                 190

CTC CTT TTA CCA CAC AAC CAT TAC CTC TCC ACA CAA TCT CCC CTT TCC   624
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
        195                 200                 205

AAA CAT CCC AAC CAA AAC ACA CAC CAC ATC CTC CTT CTT CAC TTT CTA   672
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
    210                 215                 220

ACA CC? CCT CCC ATT ACA CAT CCC ATC CAT CAA CTA TAC AAA TA        717
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225                 230                 235
:; 5 - t-;^ r.uunizes, r=2cr. jsao- . an acdLtLsr.al amine acii 

! t- - . j - « - - re - crovice cctiraai *cozar. sequence 

i 18 2" 36 «* S4 

• CCC GAC GAG CTC TTC ACC ZZZ CTC GTC ZZZ ATC CTG G7C CAG 



•il St: 



Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu



£3 t: 81 9C 99 108 

ere sac ccc cac gta aac cgc c~c aag ttc agc gtc tcc gcc gag gcc gag ggc 

Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly



117 



126 US 1*4 153 162 

~aC CG c AAC CTC ACC CTC AAG TTC ATC TSC ACC Aw^ GGC AAG CTG 

Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu
171 xao i«* «■ 307 _ 216 

C^C CCC ACC CTG GTG ACC AGC TTC GGC TAG GGC GTG CAG *«C TTC 

Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe Gly Tyr Gly Val Gln Cys Phe

:25 234 243 252 261 270 

ZZZ ZZZ TAG CCC GAC CAC ATS AAG CAG CAC SAC TTC TTC AAG TCC GGC ATG CCC 

Ala Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys Ser Ala Met Pro

279 2 ftB 297 306 31S 334 

GAA GGC TAG CTC CAG GAG CCC ACC ATC TTC TTC AAG GAC CAC GGC AAC TAC AAG 

Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys

333 342 351 360 3*9 371 

ACC CCC GGC GAG CTG AAG TTC GAG GGC CAC ACC CTC GTG AAC CCC ATC GAG CTG 

Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu

3 87 396 405 414 423 432 

AAG GGC ATC GAC TTC AAG GAG GAC GGC AAC ATC CTG GGC CAC AAG CTG GAG TAC 

Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr

< 41 450 4S9 461 477 466 

JJ . C ^ c J— ^C CTC TAT ATC ATG CCC GAC AAG CAG AAG AAC CSC ATC 

Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile

495 *04 513 522 531 540 

AAG GTG AAC TTT AAG ATC CCC CAC AAC ATC GAG GAC GGC AGC GTG CAG CTC GCC 

Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala
AAC CAC TAC CTG AGC TAC CAG TCC GCC CTG AGC AAA GAG CCC AAC GAG AAG CCC

Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg
CAC ATG GTC CTG CTG GAG TTC GTG ACC GCC GCC GGG ATC ACT CAC CCC ATG

His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr His Gly Met



Asp H 

Asp Leu Tyr Lys ••• 

Figure 4 



FIG 5-1



CRYST1
ORIGX1
ORIGX2
ORIGX3
SCALE1
SCALE2
SCALE3
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
. ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
rtTOM 
ATOM 



1.767 62.S45 

1.000000 0. 

o.ooooco 

0.000000 0. 

0.019317 0. 

0.000000 0. 

O.OOOOCO 0. 
N SER 



1 
2 
3 
4 
5 
6 
7 
8 
9 

10 

11 

12 

13 

14 

IS 

16 

17 

18 

19 

20 

21 

22 

23 

24 

25 

26 

27 

28 

29 

30 

31 

32 

33 

34 

35 

36 

37 

38 

39 

40 

41 

42 

43 

44 

45 

46 

47 

48 

49 

50 

51 

52 

53 

54 

55 

66 

57 

53 

59 

60 



CA 

C 

O 

CB 

OG 

N 

CA 

C 

O 

CB 

CG 

CD 

CE 

NZ 

N 

CA 

C 
O 
N 

CA 

C 

O 

CB 

CG 

CD 



SER 

SER 

SER 

SER 

SER 

LYS 

LYS 

LYS 

LYS 

LYS 

LYS 

LYS 

LYS 

LYS 

GLY 

GLY 

GLY 

GLY 

GLU 

GLU 

GLU 

GLU 

GLU 

GLU 

GLU 



OE1 GLU 
OE2 GLU 



CA 
C 

o 

CB 
CG 
N 

CA 

C 

O 

CB 

CG 



GLU 
GLU 
GLU 
GLU 
GLU 
GLU 
LEU 
LEU 
LZU 
LEU 
1EU 
LEU 



CD1 LEU 
CD 2 LEU 



N 

CA 

C 

O 

CB 

CG 



?HE 
rHE 
?HE 
?HE 

?HE 
?HE 



CD1 ?HE 
CD2 ?HE 
CE1 ?HE 
CE2 ?HE 
CZ ?HE 



CA 

c 
o 

C3 

OG1 

CG2 



THR 
THR 
THR 
THR 
THR 
TH?. 
THR 



70.6 
000000 
000000 
000000 
000000 
015912 
000000 

2 

2 

2 

2 

2 

2 

3 

3 

3 

3 

3 

3 

3 

3 

2 

4 
4 
4 
4 

5 
5 
5 
5 
5 
5 
c 



£ 

6 
6 
6 
6 

7 
7 
7 
7 
7 
7 
7 
8 
3 
8 
S 
3 
3 
3 
3 
8 
3 

3 

9 
o 

. a 



66 90. CO 
0.000000 
0.000000 

oooooo 

0.000000 

0.000000 

0.014151 

28.888 

27.638 

26.499 

26.606 

27.783 

27.690 

25.418 

24;141 

24.214 

24.107 

23.127 

21-768 

20.681 

20.711 

20.816 

24.318 

24.297 

25.425 

25.234 

26.606 

27.821 

27.S23 

27.850 

28.873 

30.337 

31.311 

31.508 

31.839 

26.833 

26.479 

25.561 

25.479 

25.780 

25.260 

24.864 

23.954 

24.693 

24.152 

23.050 

21.672 

21.597 

21.332 

25.944 

26.740 

27.818 

28.590 

27.309 

26.222 

25.672 

25.725 

24.661 



90. 



24. 

24. 
27, 
29. 
- c 



1?2 
798 
7C4 
709 
642 



9.409 
10.125 
9.639 
3.656 
11.635 
12.033 
10.403 
10.191 
10.266 
9.253 
11.240 
10.697 
11.731 
12.655 
14.103 
11.495 
11.798 
11.206 
10.923 
11.082 
10.598 
9.590 
9.803 
10.053 
10.461 
9.584 
9.677 
8.653 
3.499 
7.410 
7.837 
7.142 
6.330 
6.893 
3.966 
9.4S6 
10.061 
10.230 
10.548 
10.05= 
3.526 
10.591 
10.407 
11.122 
10.322 
10.856 
12.376 
12.335 
13.273 

14.290 
15.137 



90.00 
0.00000 
0.00000 
0.00000 
0.00000 
0.00000 
0.00000 
52.301 
52.516 
51.644 
E0.915 
52.378 
51.012 
51.731 
51.036 
49.497 
48.774 
51.521 
51.949 
51.987 
53.243 
52.953 
49.015 
47.605 
46.796 
45.619 
47.420 
46 . 726 
45 - 616 
44.444 
47.718 
47.425 
48.170 
49.381 
47.403 
46.017 
45.150 
43.979 
42.955 
45*.992 
47.238 
44.123 
^3.089 
917 
836 
665 
098 
,074 
45.485 



= .07- 



h i. . 
40. 
43. 
44. 
44. 



42. 
41, 



1.00 85.05 
1.00 80.05 
1.00 85.35 
1.00 84.56 
1.00 70.97 
1.00 44.08 
1.00 87.71 
1.00 87.15 
1.00 76.86 
1.00 78.27 
1.00 89.44 
1.00 75.06 
1.00 76.53 
1.00 68.55 
1.00 46.24 
1.00 5 3.62 
1.00 45.97 
1.00 31.90 
1.00 33.63 
1.00 32.54 
1.00 32.57 
1.00 28.40 
1.00 26.12 
1.00 33.53 
1.00 41.36 
1.00 90.82 
1.00 74.80 
1.00100.00 
1.00 23.57 



1.00 
1.00 31 
1.00 
1.00 
1.00 



50 
10 
30.96 
35.64 
55.52 



. 157 
.159 
40.427 
19.600 
41.320 
42.163 
43.447 
41.139 
' 772 
4 1.499 
42. 794 
40. 699 
•42 . 17 5 
j 1 . 636 
:3.062 
40.392 
4 1.527 



1.00 22.26 
1.00 21.61 
1.00 16.90 
1.00 18.23 
1.00 22.41 
1.00 22.84 
1.00 21.64 
1.00 33.14 
1.00 20.75 
1.00 21.64 
1.00 30-59 
1.00 30.05 
1.00 16.95 
l* 5 - c 



.00 
.00 
.00 
.00 
.00 
.00 
.00 27 



i / . 
1 2 
15 
12 



25 
27 



19 
69 



1.00 34.92 

:.oo 45.:: 

1.00 50.55 

I. 00 44.60 

1.00 40. 4C 

1.00 29. 76 



20.979 26.524 31.767 1.00 11.90 

22.340 26.309 31.540 1.00 3.84 

17.440 27.845 26.498 1.00 13.24 

16.588 28.453 25.479 1.00 18.02 

15.645 29.460 26.118 1.00 20.14 

15.039 29.162 27.148 1.00 17.67 

15.737 27.386 24.801 1.00 22.67 

16.585 26.271 24.291 1.00 20.66 

15.024 28.002 23.641 1.00 33.79 

16.639 26.293 22.805 1.00 23.69 

15.564 30.653 25.561 1.00 14.68 

14.681 31.635 26.170 1.00 16.93 

13.323 31.352 25.628 1.00 24.18 

13.122 31.513 24.453 1.00 20.63 

15.063 33.116 25.885 1.00 16.85 

13.913 34.268 26.712 1.00 22.06 

12.424 30.871 26.484 1.00 27.31 

11.101 30.458 26.042 1.00 32.18

10.106 31.572 25.803 1.00 37.51

9.150 31.407 25.061 1.00 35.71 

10.537 29.417 26.972 1.00 23.66 

10.387 29.989 28.258 1.00 30.10 

11.512 28.226 27.022 1.00 29.98 

10.214 32.693 26.447 1.00 32.34 
ATOM    396  CG  PRO    56      23.851  41.478  28.849  1.00 20.72
ATOM    397  CD  PRO    56      22.525  41.379  29.578  1.00 18.66
ATOM    398  N   TRP    57      23.202  36.848  28.158  1.00 11.12
ATOM    399  CA  TRP    57      23.354  25.458  28.595  1.00 12.55
ATOM    400  C   TRP    57      24.411  35.239  29.700  1.00 14.12
ATOM    401  O   TRP    57      24.178  34.586  30.709  1.00 11.49
ATOM    402  CB  TRP    57      23.604  34.535  27.406  1.00 10.55
ATOM    403  CG  TRP    57      22.235  34.237  26.641  1.00 12.65
ATOM    404  CD1 TRP    57      21.999  34.714  25.426  1.00 16.24
ATOM    405  CD2 TRP    57      21.221  33.327  27.013  1.00 12.50
ATOM    406  NE1 TRP    57      20.784  34.200  25.018  1.00 14.25
ATOM    407  CE2 TRP    57      20.215  33.354  25.963  1.00 14.65
ATOM    408  CE3 TRP    57      21.052  32.521  28.129  1.00 12.01
ATOM    409  CZ2 TRP    57      19.148  32.583  26.007  1.00 14.36
ATOM    410  CZ3 TRP    57      19.887  31.767  28.170  1.00 14.22
ATOM    411  CH2 TRP    57      18.945  31.818  27.128  1.00 10.01
ATOM    412  N   PRO    58      25.594  35.800  29.518  1.00 15.73
ATOM    413  CA  PRO    58      26.629  35.616  30.503  1.00  9.52
ATOM    414  C   PRO    58      26.241  36.010  31.878  1.00  9.71
ATOM    415  O   PRO    58      26.760  35.467  32.825  1.00 11.70
ATOM    416  CB  PRO    58      27.833  36.441  30.040  1.00 10.82
ATOM    417  CG  PRO    58      27.597  36.748  28.582  1.00 18.50
ATOM    418  CD  PRO    58      26.127  36.432  28.278  1.00 15.82
ATOM    419  N   THR    59      25.226  26.977  32.021  1.00  7.54
ATOM    420  CA  THR    59      24.976  37.366  33.357  1.00  4.53
ATOM    421  C   THR    59      24.228  26.258  24.137  1.00  8.41
ATOM    422  O   THR    59      24.174  26.251  35.367  1.00 10.57
ATOM    423  CB  THR    59      24.157  2a. 691 33.334  1.00 16.64
ATOM    424  OG1 THR    59      22.895  23.480  32.844  1.00 15.51
ATOM    425  CG2 THR    59      24.917  39.731  32.542  1.00 15.76
ATOM    426  N   LEU    60      23.636  25.304  33.427  1.00 11.99
ATOM    427  CA  LEU    60      22.859  34.248  34.073  1.00  9.15
ATOM    428  C   LEU    60      23.657  32.944  34.385  1.00 15.62
ATOM    429  O   LEU    60      23.118  32.027  35.042  1.00 11.99
ATOM    430  CB  LEU    60      21.645  33.914  33.203  1.00  7.67
ATOM    431  CG  LEU    60      20.723  35.111  23.042  1.00 14. CS
ATOM    432  CD1 LEU    60      19.620  34.775  32.062  1.00 14.54
ATOM    433  CD2 LEU    60      20.142  25.456  24.394  1.00 1C.67
ATOM    434  N   VAL    61      24.693  22.837  33.917  1.00 11.27
ATOM    435  CA  VAL    61      25.656  31.587  34.094  1.00  4.37
ATOM    436  C   VAL    61      25.678  31.013  25.496  1.00  6.02
ATOM    437  O   VAL    61      25.255  29.805  35.743  1.00 10.75
ATOM    438  CB  VAL    61      27.050  21.643  23.406  1.00  7.14
ATOM    439  CG1 VAL    61      27.eBB  30.396  33.805  1.00  6.47
ATOM    440  CG2 VAL    61      26.e90  31.745  31.876  1.00  6.63
ATOM    441  N   THR    62      26.053  31.643  36.442  1.00  7.02
ATOM    442  CA  THR    62      26.178  31.421  37.808  1.00  6.51
ATOM    443  C   THR    62      24.862  30.954  38.410  1.00  9.22
ATOM    444  O   THR    62      24.801  30.163  39.352  1.00  6.99
ATOM    445  CB  THR    62      26.816  22.520  38.660  1.00 16.57
ATOM    446  OG1 THR    62      26.103  33.744  28.453  1.00 12.00
ATOM    447  CG2 THR    62      28.297  22.708  38.225  1.00 £.86
ATOM    448  N   THR    63      23.814  31.547  37.910  1.00 9.9e
ATOM    449  CA  THR    63      22.457  21.212  38.388  1.00  6.69
ATOM    450  C   THR    63      22.033  29.830  37.865  1.00  3.14
ATOM    451  O   THR    63      21.499  23.984  28.604  1.00 12.48
ATOM    452  CB  THR    63      21.458  22.312  27.925  1.00 11.14
ATOM    453  OG1 THR    63      21.735  23.498  23.602  1.00 11.75
ATOM    454  CG2 THR    63      20.024  21.897  33.296  1.00  9.21
ATOM    455  N   PHE    64      22.250  29.620  36.583  1.00 10.13
ATOM    456  CA  PHE    64      21.595  23.371  25.995  1.00  3.00
ATOM    457  C   PHE    64      22.774  27.253  26.5ie  1.00 25.25
ATOM    458  O   PHE    64      22.213  26.147  26.761  1.00  9.64
ATOM    459  CB  PHE    64      22.114  23.438  24.512  1.00 ~.33
ATOM    460  CG  PHE    64      21.222  29.357  23.750  1.00 10.96
ATOM    461  CD1 PHE    64      21.724  29.954  22.593  1.00 r.15
ATOM    462  CD2 PHE    64      19.299  29.563  34.106  1.00 14.42
ATOM    530  O   ARG    73      36.132  18.376  42.599  1.00 16.14
ATOM    531  CB  ARG    73      35.694  16.817  40.013  1.00 16.80
ATOM    532  CG  ARG    73      36.549  15.616  40.460  1.00 20.13
ATOM    533  CD  ARG    73      37.489  15.093  39.381  1.00 28.47
ATOM    534  NE  ARG    73      38.743  15.859  29.260  1.00 25.48
ATOM    535  CZ  ARG    73      39.756  15.777  40.127  1.00 28.04
ATOM    536  NH1 ARG    73      39.688  15.004  41.195  1.00 28.76
ATOM    537  NH2 ARG    73      40.865  16.504  39.918  1.00 39.65
ATOM    538  N   TYR    74      35.151  16.561  43.424  1.00 12.05
ATOM    539  CA  TYR    74      35.861  16.659  44.690  1.00 11.57
ATOM    540  C   TYR    74      36.946  15.566  44.721  1.00 25.02
ATOM    541  O   TYR    74      36.658  14.387  44.558  1.00 19.71
ATOM    542  CB  TYR    74      34.978  16.528  45.934  1.00 15.51
ATOM    543  CG  TYR    74      34.395  17.850  46.402  1.00 16.59
ATOM    544  CD1 TYR    74      33.455  18.546  45.631  1.00 14.44
ATOM    545  CD2 TYR    74      34.799  18.399  47.618  1.00 15.94
ATOM    546  CE1 TYR    74      32.901  19.756  46.059  1.00  7.99
ATOM    547  CE2 TYR    74      34.261  19.612  48.058  1.00 18.29
ATOM    548  CZ  TYR    74      33.294  20.276  47.298  1.00 13.87
ATOM    549  OH  TYR    74      32.829  21.507  47.738  1.00 18.39
ATOM    550  N   PRO    75      38.181  15.947  44.902  1.00 19.20
ATOM    551  CA  PRO    75      39.213  14.940  44.995  1.00 18.42
ATOM    552  C   PRO    75      38.958  13.993  46.175  1.00 15.60
ATOM    553  O   PRO    75      38.373  14.361  47.174  1.00 11. S9
ATOM    554  CB  PRO    75      40.514  15.681  45.195  1.00 18.31
ATOM    555  CG  PRO    75      40.242  17.158  44.863  1.00 24.81
ATOM    556  CD  PRO    75      38.742  17.306  44.694  1.00 15.41
ATOM    557  N   ASP    76      39.433  12.756  46.038  1.00 18.63
ATOM    558  CA  ASP    76      39.269  11.770  47.062  1.00 16.19
ATOM    559  C   ASP    76      39.581  12.280  48.431  1.00 15. S2
ATOM    560  O   ASP    76      38.862  12.042  49.389  1.00 17.35
ATOM    561  CB  ASP    76      40.083  10.507  46.790  1.00 18.69
ATOM    562  CG  ASP    76      39.826   9.432  47.825  1.00 24.04
ATOM    563  OD1 ASP    76      40.523   9.268  48.817  1.00 29.72
ATOM    564  OD2 ASP    76      38.732   3.743  47.584  1.00 40.96
ATOM    565  N   HIS    77      40.647  12.984  48.561  1.00 18.79
ATOM    566  CA  HIS    77      40.978  13.4ie  49.877  1.00 19.25
ATOM    567  C   HIS    77      40.117  14.507  30.397  1.00 24.57
ATOM    568  O   HIS    77      40.205  14.826  51.551  1.00 27.15
ATOM    569  CB  HIS    77      42.435  13.806  50.042  1.00 19.84
ATOM    570  CG  HIS    77      42.743  15.035  49.322  1.00 17.31
ATOM    571  ND1 HIS    77      42.925  15.028  47.953  1.00 21.86
ATOM    572  CD2 HIS    77      42.925  15.295  49.774  1.00 18.70
ATOM    573  CE1 HIS    77      43.203  16.289  47.593  1.00 17.49
ATOM    574  NE2 HIS    77      43.213  17.069  43.668  1.00 18.11
ATOM    575  N   MSE    78      39.277  15.069  49.565  1.00 25.36
ATOM    576  CA  MSE    78      38.412  16.140  50.026  1.00 24.65
ATOM    577  C   MSE    78      36.920  15.774  50.066  1.00 26.47
ATOM    578  O   MSE    78      36.070  16.636  50.260  1.00 28.16
ATOM    579  CB  MSE    78      38.596  17.331  49.121  1.00 26.38
ATOM    580  CG  MSE    78      39.803  18.177  49.406  1.00 27.01
ATOM    581  SE  MSE    78      39.987  19.608  43.117  1.00 43.09
ATOM    582  CE  MSE    78      38.874  20.873  49.044  1.00 27.11
ATOM    583  N   LYS    79      36.606  14.509  49.856  1.00 18.68
ATOM    584  CA  LYS    79      35.216  14.061  49.853  1.00 21.54
ATOM    585  C   LYS    79      34.406  14.449  51.082  1.00 20.21
ATOM    586  O   LYS    79      33.186  14.652  31.025  1.00 21.08
ATOM    587  CB  LYS    79      35.152  12.581  49.612  1.00 23.48
ATOM    588  CG  LYS    79      35.859  12.225  43.317  1.00 41.09
ATOM    589  CD  LYS    79      35.159  11.134  47.535  1.00 34.65
ATOM    590  CE  LYS    79      35.796  13.831  46.131  1.00 53.46
ATOM    591  NZ  LYS    79      35.084  11.549  43.030  1.00 49.53
ATOM    592  N   ARG    80      35.069  14.542  32.213  1.00 19.77
ATOM    593  CA  ARG    80      34.365  14.374  33.434  1.00 20.13
ATOM    594  C   ARG    80      33.393  13.311  =3.431  1.00 26.42
ATOM    595  O   ARG    80      23.25 1 15.7.   E4.467  1.00 23.51
ATOM    596  CB  ARG    80                      =.4.7C0 1.00 24.53
36.204 15.620 55.034 1.00 29.71 
36.964 15.344 56.335 1.00 61.30 
36.551 16.230 57.415 1.00 71.14 
37.398 16.882 58.192 1.00100.00 
38.714 16.758 58.040 1.00100.00 
36.917 17.679 59.155 1.00 99.06 
34.275 17.121 52.473 1.00 18.77 
33.903 18.547 52.499 1.00 19.60 
32.841 18.883 51.486 1.00 18.62 
32.557 20.043 51.295 1.00 17.76 
35.129 19.472 52.283 1.00 20.39 
36.221 19.224 53.305 1.00 28.02 
36.127 19.701 54.618 1.00 30.59 
37.392 1S.535 53.202 1.00 29.02 
37.218 19.308 5S.265 1.00 26.24 
37.991 18.603 54.452 i.00 28.18 
32.298 17.843 50.841 1.00 12.20 
31.358 18.011 49.769 1.00 13.24 
29.922 18.148 50.259 1.00 24.30 
29.175 17.195 50.243 1.00 16.55 
31.480 16.917 48.730 1.00 12.23 
30.642 17.209 47.518 1.00 9.92 
29.870 18.134 47.459 1.00 20.31 
30.938 16.466 46.507 1.00 11.12 
29.566 19.353 50.705 1.00 23.66 
28.220 19.634 51.201 1.00 20.23 
27.154 19.333 50.168 1.00 20.93 
26.116 13.733 50.503 i.00 15.97 
28.077 21.106 51.666 1.00 19.59 
26.624 21.613 51.805 1.00 16.91 
25.946 21.498 53.021 1.00 17.76 
25.968 22.236 50.734 1.00 18.88 
24.635 21.960 53.156 1.00 24.13 
24.650 22.690 50.840 1.00 19.24 
24.001 22.575 52.068 1.00 20.67 
27.432 19.784 48.921 1.00 14.06 
26.515 19.693 47.809 1.00 12.56 
25.893 18.332 47.602 1.00.24.56 
24.674 13.200 47.534 1.00 21.55 
27.085 20.265 46.513 1.00 13.44 
27.630 21.645 46.721 1.00 14.27 
29.001 21.845 46.890 1.00 15.17 
26.781 22.753 46.752 1.00 13.48 
29.520 23-129 47.073 1.00 14.63 
27.276 24.041 46.969 1.00 16.34 
28.650 24.221 47.137 1.00 15.77 
26.738 17.330 47.482 1.00 14.07 
26.294 15.985 47.283 1.00 13.30 
25.657 15.371 48.547 1.00 13.43 
24.773 14.509 48.429 1.00 18.46 
27.434 15.089 46.757 1.00 17.38 
27.873 15.372 45.323 1.00 13.93 
28.969 14.381 44.888 1.00 13.23 
29.766 14.819 43.662 1.00 10.36 
30.319 16.185 43.773 1.00 12.92 
26.119 15.795 49.752 1.00 11.03 
25.610 15.267 50.998 1.00 12.09 
24.156 15.639 51.240 1.00 21.58 
23.452 14.979 52.013 1.00 19.89 
26.448 15.661 52.208 1.00 15.45 
26.308 17.042 52.495 1.00 22.05 
23.705 16.698 50.5S2 1.00 15.09 
22.333 17.138 50.762 1.00 19.52 
21.337 16.399 49.870 1.00 13.60 
20.162 16.557 50.040 1.00 19.55 
22.204 13.647 50.632 1.00 19.23 
21.23S 15.536 43.976 1.00 14.05 
88 21.007 14.796 48.035 1.00 15.32 

88 20.496 13.448 48.579 1.00 21.43 

88 21.109 12.876 49.457 1.00 22.22 

88 21.848 14.593 46.791 1.00 16.92 

88 22.263 15.891 46.131 1.00 1C.56 

88 20.737 16.894 45.394 1.00 31.99 

88 21.313 18.684 45.748 1.00 28.66 

89 19.363 12.930 48.084 1.00 14.73 
89 18.552 13.475 47.008 1.00 14.60 
89 17*572 14.611 47.385 1.00 12.10 
89 17.085 15.301 46.493 1.00 18.06 
89 17.733 12.294 46.494 1-00 17.00 
89 17.726 11.261 47.607 1.00 15.33 

89 18.844 11.642 48.560 1.00 17.16 

90 17.278 14.795 48.695 1.00 14.63 
90 16.348 15.838 49.157 1.00 20.63 
90 16.701 17.229 48.645 1.00 25.59 
90 15.833 18.042 48.368 1.00 21.57 
90 16.031 15.816 50.682 1.00 22.21 
90 15.782 14.403 51.228 1.00 37.59 
90 17.071 13.641 51.447 1.00 83.49 
90 18.179 14.151 51.342 1.00 54.80 

90 16.875 12.373 51.749 1.00 64.65 

91 17.977 17.509 48.510 1.00 21.39 
91 18.394 18.769 47.906 1.00 17.77 
51 18.673 19.911 48.839 1.00 12.17 

91 18-769 19.764 50.055 1.00 16.51 

92 18.861 21.086 48.225 1.00 13.02 
92 19.143 22.266 48.994 1.00 1C. 33 
92 18.575 23.478 48.347 1.00 9.87 
92 18.270 23.483 47.144 1.00 15. S9 
92 20.678 22.488 49.278 1.00 15.40 
92 21.546 22.468 48.012 1.00 15.13 
92 21.620 23.576 47.166 1.00 14.75 
92 22.317 21.350 47.683 1.00 16.09 
92 22.404 23.S61 46.005 1.00 6.50 
92 23.067 21.300 46.504 1.00 15.12 
92 23.156 22.424 45.682 1.00 13.13 

92 23.944 22.393 44.517 1.00 13.37 

93 18.447 24.504 49.189 1.00 11.93 
93 13.025 25.822 48.778 1.00 14.74 
93 19.221 26.666 48.625 1.00 16. CO 
93 20.172 26.625 49.451 1.00 15.16 
93 17.073 26.480 49.791 1.00 23.45 
93 16.855 27.937 49.413 1.00 26.05 

93 15.716 25.764 49.771 1.00 22.50 

94 19.361 27.345 47.521 1.00 13.73 
94 20.480 28.195 47.227 1.00 1C.53 
94 19.948 29.583 46.998 1.00 12.23 
94 19.153 29.788 46.061 1.00 15.52 
94 21.232 27.727 45.934 1.00 "-95 
94 22.361 29.708 45.469 1.00 11.37 
94 23.431 27.999 44.632 1.00 12.04 
94 23.805 26.879 44.946 1.00 13.60 

94 23.719 23.527 43.449 1.00 7.98 

95 20.396 30-531 47.820 1.00 11.73 
95 19.974 31.899 47.643 1.00 13.47 
95 21.149 32.804 47.398 1.00 13.42 
95 22.206 32.623 47.985 1.00 19.23 
95 19.277 32.427 48.878 1.00 13.32 
95 18.009 31.684 49.215 1.00 23.46 
95 17.657 32.016 50.622 1.00 45. 53 
95 17.574 33.166 51.011 1.0010C.C0 

17.764 30.987 51.423 1.00 61.33 

=6 30.929 33.83e 46.601 1.00 15.51 

56 11.973 34.733 46.342 1.00 16.37 

56 31.510 35.195 46.206 1.00 15.34 
103 24.238 48.274 34.688 1.00 19.05 

103 22.774 48.809 36.283 1.00 23.89 

104 22.612 47.542 28.646 1.00 20.17 
104 21.598 46.900 29.498 1.00 20.22 
104 22.055 45,619 40.180 1.00 24.68 

104 23.202 45.211 40.085 1.00 18.06 

105 21.125 44.967 40.872 1.00 15.71 
105 21.425 43.703 41.510 1.00 8.89 
105 20.399 42.620 41.181 1.00 21.85 
105 19.255 42.911 40.824 1.00 15.17 
105 21.605 43.840 43.001 1.00 3.58 
105 20.359 44.366 43.697 1.00 43.57 
105 19.565 43.601 44.259 1.00 36.67 

105 20.17B 45.674 43.659 1.00 36.47 

106 20.826 41.365 41.328 1.00 16.80 
106 19.966 40.219 41.156 1.00 13.90 
106 19.763 39.543 42.475 1.00 11.05 
106 20.678 39.404 43.281 1.00 13.86 
106 20.547 39.128 40.246 1.00 15.88 
106 20.619 39.398 38.793 1-00 15.57 
106 19.952 40.458 28.173 1.00 13.14 
106 21.373 38.524 28.006 1.00 13.35 
106 20.038 40.632 26.793 1.00 13.44 
106 21.481 38.692 26.623 1.00 10.87 
106 20.814 39.751 36.025 1.00 15.93 

106 20.970 39.931 34.670 1.00 17.32 

107 18.533 39.115 42.709 1.00 12.39 
107 18.194 38-349 43.897 1.00 11.51 
107 17.619 37.037 43.397 1.00 17.25 
107 16.704 37.010 42.562 1.00 13.14 
107 17.217 39.063 44.823 1.00 14.82 
107 17.860 39.631 46.060 1.00 40.71 

107 18.528 40.974 45.793 1-00 43.48 

108 18.205 35.951 43.835 1.00 14.95 
108 17.774 34.658 43.352 1.00 11.97 
108 17.463 33.696 44.468 1.00 15.81 
108 18.043 33.734 45.582 1.00 13.68 
10e 18.847 34.034 42.410 1.00 23.81 
108 20.064 33.791 43.137 1.00 13.88 

108 19.123 34.968 41.264 i.00 13.04 

109 16.560 32.804 44.154 1.00 13.57 
109 16.212 31.751 45.048 1.00 12.56 
109 15.939 20.498 44.254 1.00 13.07 
109 15.239 30.509 43.249 1.00 12.52 
109 15.069 32.100 45.959 1.00 17.32 
109 14.767 30.995 46.932 1.00 17.92 
109 13.400 31.160 47.610 1.00 19.99 
109 12.821 29.854 47.883 1.00 36.05 
109 12.968 29.244 49.035 1.00 55.71 
109 13.630 29.815 50.046 1.00 44.11 

109 12.432 28.041 49.195 1.00 94.34 

110 16.577 29.414 44.635 1.00 13.26 
110 16.377 28.207 43.870 1.00 12.68 
110 16.346 26.979 44.724 1.00 13.15 
110 16.829 26.965 45.869 1.00 16.75 

110 17.465 23.059 42.e22 1.00 17.31 

111 15.770 25.939 44.175 1.00 15.39 
111 15.741 24.655 44.823 1.00 15.24 
111 16.438 23.678 43.926 1.00 12.08 
111 16.086 23.545 42.771 1.00 15.70 
111 14.303 24.122 44.993 1.00 19.20 
111 13.744 24.242 46.299 1.00 23.62 
111 12.247 24.280 46.372 1.00 60.99 
111 11.539 22.843 45.432 1.00 76.05 

111 11.742 24.956 47.250 1.00 54.87 
112 17.433 22.965 44.457 1.00 10.73 

112 13.063 21.973 ^3.621 1.00 10.98 



WO 98/06737 



PCT/US97/14593 



FIG 5-14 



21/36 



ATOM 


365 


C 


VAL 


112 


17.968 


20. 630 


ATOM 


366 


0 


VAL 


112 


18.271 


20. 438 


ATOM 


667 


CB 


VAL 


112 


19.428 


22 . 358 


ATOM 


368 


CGI 


VAL 


112 


19.966 


23 . 704 


ATOM 


869 


CG2 


VAL 


112 


20.452 


21.232 


ATOM 


870 


N 


LYS 


113 


17.415 


19.732 


ATOM 


871 


CA 


LYS 


113 


17.175 


18.421 


ATOM 


872 


C 


LYS 


113 


16.822 


17.485 


ATOM 


373 


O 


LYS 


113 


16.695 


17.893 


ATOM 


874 


CB 


LYS 


113 


16.032 


18.497 


ATOM 


875 


CG 


LYS 


113 


14.792 


19.084 


ATOM 


876 


CD 


LYS 


113 


13.509 


18.321 


ATOM 


877 


CE 


LYS 


113 


12.526 


19.134 


ATOM 


878 


NZ 


LYS 


113 


12.379 


20.518 


ATOM 


879 


N 


PHE 


114 


16.683 


16.208 


ATOM 


880 


CA 


PHE 


114 


16.325 


15.175 


ATOM 


881 


C 


PHE 


114 


14.806 


14.975 


ATOM 


882 


o 


PHE 


114 


14.110 


14.878 


ATOM 


883 


CB 


PHE 


114 


16.866 


13.838 


ATOM 


884 


CG 


PHE 


114 


18.231 


13.536 


ATOM 


885 


CD1 


PHE 


114 


19.344 


13.795 


ATOM 


886 


CD 2 


PHE 


114 


18.403 


13.009 


ATOM 


887 


CE1 


PHE 


114 


20.627 


13.500 


ATOM 


888 


CE2 


PHE 


114 


19.673 


12.708 


ATOM 


889 


CZ 


PHE 


114 


20.780 


12.953 


ATOM 


890 


»« 


GLU 


IIS 


14.354 


14.819 


ATOM 


891 


CA 


GLU 


115 


12.97B 


14.473 


ATOM 


892 


C 


GLU 


115 


13-121 


13.193 


ATOM 


893 


o 


GLU 


115 


13.434 


13.207 


ATOM 


894 


CB 


GLU 


115 


12.348 


15.481 


ATOM 


895 


CG 


GLU 


115 


11.856 


16.747 


ATOM 


896 


CD 


GLU 


115 


10.742 


16.460 


ATOM 


897 


OE1 


GLU 


115 


10.181 


15.395 


ATOM 


898 


OE2 


GLU 


115 


10.460 


17.461 


ATOM 


899 


N 


GLY 


116 


13.005 


12.087 


ATOM 


900 


CA 


GLY 


116 


13.225 


10.861 


ATOM 


901 


C 


GLY 


116 


14.727 


10.767 


ATOM 


902 


o 


GLY 


116 


15.516 


10.922 


ATOM 


903 


N 


ASP 


117 


15.137 


10.564 


ATOM 


904 


CA 


ASP 


117 


16.572 


10.462 


ATOM 


905 


C 


ASP 


117 


17.237 


11.677 


ATOM 


906 


O 


ASP 


117 


18.423 


11.672 


ATOM 


907 


CB 


ASP 


117 


17.055 


9.074 


ATOM 


908 


CG 


ASP 


117 


16.624 


8.677 


ATOM 


909 


OD1 ASP 


117 


16.230 


9.468 


ATOM 


910 


O02 ASP 


117 


16.805 


7.391 


ATOM 


911 


N 


THR 


118 


16.463 


12.729 


ATOM 


912 


CA 


THR 


118 


16.889 


13.981 


ATOM 


913 


C 


THR 


118 


17.186 


14.988 


ATOM 


914 


0 


THR 


118 


16.498 


15.064 


, ATOM 


915 


CB 


THR 


118 


15.806 


14.497 


•ATOM 


916 


OG1 THR 


118 


15.552 


13.508 


ATOM 


917 


CG2 THR 


118 


16.217 


15.793 


ATOM 


918 


N 


LEU 


119 


18.284 


15 .681 


ATOM 


919 


CA 


LEU 


119 


18.679 


16 . 706 


ATOM 


920 


C 


* LEU 


119 


18.036 


17 .992 


ATOM 


921 


O 


LEU 


119 


18 . 194 


o - Job 


ATOM 


922 


CB 


LEU 


119 


20.243 


15.815 


ATOM 


923 


CG 


LEU 


119 


20.845 


17.678 


ATOM 


924 


CD1 LEU 


:i9 


20.701 


19.167 


ATOM 


925 


CD2 LEU 


:i9 


20.366 


17.311 


ATOM 


926 


N 


VAL 


120 


17.230 


13.595 


ATOM 


927 


CA 


VAL 


120 


16.466 


19.797 


ATOM 


928 


C 


VAL 


12.0 


16.929 


11.039 


ATOM 


929 


o 


VAL 


120 


17. 135 


11.039 


ATOM 


930 


C3 


VAL 


120 


14.939 


19.566 


ATOM 


931 


CGI VAL 


120 


14.133 


10.790 



44.261 

45.432 

43.012 

43.487 

43.078 

43.516 

44.045 

42.931 

41.808 

45.036 

44.376 

44.703 

45.528 

45.036 

43.267 

42.317 

42.181 

43.160 

42.838 

42.338 

43.139 

41.056 

42.665 

40.572 

41.387 

40.956 

40.642 

39.906 

38.730 

39.667 

40.376 

41.342 

41.431 

42.079 

40.585 

39.e69 

39.641 

40.570 

38.439 

38.233 

37.598 

27.265 

37.733 

36.348 

35.495 

36.130 

37.493 

36.910 

37.976 

33.996 

35.952 

34.990 

35.275 

37.805 

38.759 

28.259 

2 7.091 

33.839 

29.951 

39.669 

41.333 

19.170 

23.859 

29.537 

40.762 

29.082 

23.642 



1.00 3. 62 
1.00 15.63 
1.00 22.75 
1.00 16.69 
1.00 18.47 
1.00 14.67 
1.00 16.41 
1.00 7.11 
1.00 16.27 
1.00 22.50 
1.00 20.40 
1.00 44.65 
1.00 54.02 
1.00100.00 
1.00 10.09 
1.00 11.41 
1.00 14.18 
1.00 15.03 
1.00 12.89 
1.00 16.80 
1.00 18.61 
1.00 19.50 
1.00 22.78 
1.00 25.35 
1.00 23.99 
1.00 15.29 
1.00 11.40 
1.00 13.30 
1.00 18.72 
1.00 9.68 
1.00 19.54 
1.00 38.12 
1.00 34.84 
1.00 27.88 
1.00 14.51 
1.00 15.91 
1.00 23.59 
1.00 19.35 
1.00 20.25 
1.00 28.00 
1.00 22.29 
1.00 21.23 
1.00 33.06 
1.00 55.04 
1.00 59.57 
1.00 82.48 
1.00 19.62 
1.00 18.21 
1.00 18.92 
1.00 15.94 
1.00 19.03 
1.00 21.42 
1.00 15.49 
1.00 13.66 



1.00 13.50 
1.00 3.81 
1.00 12.49 
1.00 12.25 
1.00 3.90 
1.00 10.11 
1.00 7.36 



1.00 13.34 
1.00 12.77 



.00 
.00 



z . so 
13.22 



1.00 17.50 
1.00 17.52 



FIG 5-15 



ATOM 932 CG2 VAL 120 14.501 18.351 28.246 1.00 15.35 
ATOM 933 N ASN 121 17.067 22.111 38.839 1.00 12.24 
ATOM 934 CA ASN 121 17.424 23.405 29.400 1.00 11.73 
ATOM 935 C ASN 121 16.301 24.382 39.060 1.00 11.18 
ATOM 936 O ASN 121 16.195 24.802 37.934 1.00 11.09 
ATOM 937 CB ASN 121 18.753 23.928 38.791 1.00 11.41 
ATOM 938 CG ASN 121 19.201 25.261 39.367 1.00 11.07 
ATOM 939 OD1 ASN 121 18.773 25.654 40.461 1.00 12.06 
ATOM 940 ND2 ASN 121 20.124 25.938 38.670 1.00 11.90 
ATOM 941 N ARG 122 15.470 24.706 40.029 1.00 13.69 
ATOM 942 CA ARG 122 14.348 25.610 39.825 1.00 12.99 
ATOM 943 C ARG 122 14.622 26.946 40.498 1.00 5.89 
ATOM 944 O ARG 122 14.749 27.011 41.723 1.00 14.47 
ATOM 945 CB ARG 122 13.068 25.025 40.417 1.00 15.99 
ATOM 946 CG ARG 122 12.478 23.921 39.589 1.00 30.23 
ATOM 947 CD ARG 122 11.282 23.244 40.281 1.00 60.61 
ATOM 948 N ILE 123 14.663 27.992 39.680 1.00 11.46 
ATOM 949 CA ILE 123 15.030 29.340 40.095 1.00 11.86 
ATOM 950 C ILE 123 13.991 30.450 39.835 1.00 10.54 
ATOM 951 O ILE 123 13.370 30.535 38.765 1.00 12.83 
ATOM 952 CB ILE 123 16.296 29.757 29.292 1.00 15.41 
ATOM 953 CG1 ILE 123 17.316 28.585 39.180 1.00 12.27 
ATOM 954 CG2 ILE 123 16.944 30.993 39.918 1.00 14. CI 
ATOM 955 CD1 ILE 123 17.652 28.242 37.743 1.00 7.74 
ATOM 956 N GLU 124 13.953 31.358 40.793 1.00 11.36 
ATOM 957 CA GLU 124 13.139 22.572 40.700 1.00 15.20 
ATOM 958 C GLU 124 14.168 33.713 40.811 1.00 11.93 
ATOM 959 O GLU 124 14.919 33.797 41.780 1.00 15.61 
ATOM 960 CB GLU 124 12.028 32.677 41.751 1.00 19.74 
ATOM 961 CG GLU 124 12.387 33.337 43.089 1.00 72.94 
ATOM 962 N LEU 125 14.183 34.550 39.808 1.00 12.19 
ATOM 963 CA LEU 125 15.092 35.654 39.767 1.00 15.00 
ATOM 964 C LEU 125 14.420 37.011 39.722 1.00 19.25 
ATOM 965 O LEU 125 13.563 37.267 38.893 1.00 13.41 
ATOM 966 CB LEU 125 15.976 35.533 38.510 1.00 14.29 
ATOM 967 CG LEU 125 17.003 26.683 33.375 1.00 17.55 
ATOM 968 CD1 LEU 125 18.302 36.083 27.849 1.00 13.46 
ATOM 969 CD2 LEU 125 16.511 37.732 37.367 1.00 12.09 
ATOM 970 N LYS 126 14.890 37.897 40.554 1.00 12.73 
ATOM 971 CA LYS 126 14.391 39.260 40.579 1.00 15.92 
ATOM 972 C LYS 126 15.563 40.276 40.445 1.00 13.52 
ATOM 973 O LYS 126 16.489 40.246 41.246 1.00 19.85 
ATOM 974 CB LYS 126 13.611 39.487 41.877 1.00 17.21 
ATOM 975 CG LYS 126 12.853 40.786 41.923 1.00 33.94 
ATOM 976 CD LYS 126 11.356 40.601 41.675 1.00 60.87 
ATOM 977 CE LYS 126 10.652 41.929 41.521 1.00 52.70 
ATOM 978 NZ LYS 126 11.229 42.988 42.367 1.00 47.22 
ATOM 979 N GLY 127 15.514 41.127 39.411 1.00 18.71 
ATOM 980 CA GLY 127 16.551 42.151 39.121 1.00 17.32 
ATOM 981 C GLY 127 16.012 43.572 39.272 1.00 25.32 
ATOM 982 O GLY 127 14.981 43.908 38.693 1.00 20.14 
ATOM 983 N ILE 128 16.706 44.404 40.070 1.00 18.42 
ATOM 984 CA ILE 128 16.282 45.787 40.243 1.00 21.04 
ATOM 985 C ILE 128 17.405 46.789 40.196 1.00 25.93 
ATOM 986 O ILE 128 18.562 46.496 40.429 1.00 19.37 
ATOM 987 CB ILE 128 15.432 46.052 41.504 1.00 23.32 
ATOM 988 CG1 ILE 128 16.408 45.888 42.701 1.00 23.36 
ATOM 989 CG2 ILE 128 14.272 45.120 41.577 1.00 23.95 
ATOM 990 CD1 ILE 128 15.324 46.391 44.013 1.00 29.89 
ATOM 991 N ASP 129 15.999 43.002 29.918 1.00 20.26 
ATOM 992 CA ASP 129 17.361 49.124 29.882 1.00 13.53 
ATOM 993 C ASP 129 13.364 49.086 23.801 1.00 20.25 
ATOM 994 O ASP 129 19.949 49.632 23.953 1.00 24.23 
ATOM 995 CB ASP 129 13.498 49.407 41.253 1.00 20.57 
ATOM 996 CG ASP 129 17.545 50.077 42.226 1.00 43.70 
ATOM 997 OD1 ASP 129 16.652 50.242 41.333 1.00 49.42 
ATOM 998 OD2 ASP 129 17.770 49.740 42.475 1.00 23.07 
22.931 44.207 15.018 1.00 22.20 
21.708 43.551 15.550 1.00 25.52 
21.666 42.132 25.785 1.00 25.67 
20.525 44.092 25.927 1.00 28.09 
20.474 41.918 25.275 1.00 27.50 
19.766 43.044 25.382 1.00 29.53 
26.187 45.311 26.525 1.00 23.51 
27.569 45.638 26.197 1.00 25.82 
28.600 44.537 26.560 1.00 26.28 
29.824 44.730 26.391 1.00 22.29 
27.977 46.937 26.911 1.00 27.56 
27.269 48.217 26.445 1.00 31.19 
27.234 49.254 27.582 1.00 51.32 
26.924 50.696 27.169 1.00 47.92 
27.112 51.663 28.284 1.00 73.76 
28.116 43.403 27.115 1.00 19.33 
28.987 42.296 27.559 1.00 14.32 
29.366 41.401 26.427 1.00 20.75 
28.526 41.087 25.620 1.00 19.01 
28.313 41.488 28.676 1.00 12.53 
27.979 42.352 29.875 1.00 17.54 
27.700 41.469 21.070 1.00 24.81 
29.116 43.210 20.182 1.00 27.50 
30.644 40.937 25.346 1.00 14.76 
31.040 40.059 25.311 1.00 13.43 
30.462 33.691 25.641 1.00 15.69 
30.175 38.393 26.787 1.00 16.43 
32.558 39.866 25.204 1.00 14.73 
33.290 41.077 24.624 1.00 29.30 
34.787 41.003 24.825 1.00 56.32 
35.340 40.098 25.420 1.00 31.70 
35.430 42.015 24.321 1.00 34.10 
30.365 37.873 24.632 1.00 16.30 
29.837 36.542 24.764 1.00 20.04 
30.925 35.559 25.049 1.00 12.46 
31.327 34.792 24.193 1.00 16.99 
29.035 35.113 23.498 1.00 20.96 
28.187 34.857 23.674 1.00 16.12 
27.040 34.859 24.472 1.00 18.24 
28.512 23.684 22.986 1.00 12.87 
26.257 33.708 24.615 1.00 17.91 
27.735 32.530 23.104 1.00 16.58 
26.603 22.551 23.914 1.00 17.35 
25.861 31.432 24.035 1.00 23.40 
31.392 35.597 26.251 1.00 12.40 
32.428 34.7C3 25.689 1.00 12.05 
32.433 34.675 23.193 1.00 15.75 
31.637 35.369 23.837 1.00 14.58 
33.823 35.038 25.068 1.00 18.45 

34.310 26.445 25.374 1.00 18.98 
34.150 36.951 27.488 1.00 20.34 
34.891 37.085 25.382 1.00 23.02 

33.311 23.876 23.773 1.00 12.16 
33.343 23.765 20.195 1.00 10.63 
34.765 23.453 20.730 1.00 14.58 
35.510 22.751 20.090 1.00 13.83 
32.404 32.627 20.571 1.00 9.76 
31.698 22.916 21.326 1.00 11.86 
30.515 23.655 21.808 1.00 9.04 
32.138 22.419 23.030 1.00 10.07 
29.860 23.948 22.999 1.00 3.36 
31.544 22.707 24.235 1.00 15.32 
30.375 23.469 24.206 1.00 11.69 
29.730 22.725 25.275 1.00 15.22 
25.036 23.951 21.923 1.00 15.25 
26.415 23.727 22.560 1.00 17.00 
36.426 22.613 22.339 1.00 19.63 
146 35.395 32.043 33.848 1.00 14.71 

146 36.844 25.062 33.235 1.00 11.89 

146 37.013 36.147 32.215 1.00 35.45 

146 37.533 35.890 31.105 1.00 31.63 

146 36.547 37.349 32.553 1.00 19.74 

147 37.630 32.338 34.201 1.00 12.09 
147 37.804 31.320 35.266 1.00 8.55 
147 37.769 31.999 36.575 1.00 11.70 
147 38.219 33.125 36.671 1.00 16.56 
147 39.148 30.540 35.129 1.00 9.87 

147 39.212 29.980 33.828 1.00 33.20 

148 37.195 31.365 37.583 1.00 5.53 
148 37.090 31.998 38.850 1.00 8.06 
148 37.346 31.038 39.949 1.00 11.30 
148 37.328 29.844 39.754 1.00 16.87 
148 35.648 32.608 39.067 1.00 11.29 
148 35.215 33.554 37.972 1.00 10.84 
148 34.548 33.121 36.836 1.00 12.77 
148 35.403 34.887 37.851 1.00 8.82 
148 34.389 34.178 36.060 1.00 8.84 

148 34.882 35.242 36.647 1.00 8.82 

149 37.534 31.579 41.125 1.00 10.80 
149 37.626 20.805 42.345 1.00 13.35 
149 36.409 21.157 43.2C5 1.00 14.47 
149 36.099 22.320 43.387 1.00 18.17 
149 38.850 21.093 43.184 1.00 12.67 
149 40.148 30.822 42.424 1.00 20.21 
149 40.993 31.713 42.281 1.00 56.34 

149 40.210 29.641 41.818 1.00 16.44 

150 35.773 30.144 43.741 1.00 14.65 
150 34.588 30.262 44.552 1.00 12.92 
150 34.910 29.806 45.943 1.00 16.30 
150 35.257 28.665 46.147 1.00 17.83 
150 33.482 29.382 43.914 1.00 15.22 
150 32.252 29.297 44.765 1.00 14.09 

150 33.172 29.791 42.464 1.00 10.94 

151 34.796 20.716 46.900 1.00 17.64 
151 35.139 30.440 48.275 1.00 18.31 
151 34.003 29.917 49.117 1.00 24.35 
151 32.963 30.536 49.239 1.00 20.83 
151 35.793 31.681 48.920 1.00 20.15 
151 37.025 32.033 48.141 1.00 25.86 
151 37.003 32.989 47.127 1.00 26.00 
151 38.200 31.315 48.355 1.00 28.66 
151 38.151 33.234 46.369 1.00 33.73 
151 39.360 31.550 47.619 1.00 29.01 
151 39.325 32.512 46.618 1.00 29.55 

151 40.449 32.737 45.877 1.00 38.69 

152 34.250 28.791 49.753 1.00 17.71 
152 33.255 28.159 50.572 1.00 14.12 
152 33.619 28.056 52.000 1.00 18.51 
152 34.728 27.703 52.336 1.00 22.05 
152 32.979 26.776 50.060 1.00 16.66 
152 32.431 26.875 48.638 1.00 11.30 
152 32.017 26.078 51.021 1.00 17.96 

152 32.377 25.559 47.949 1.00 13.48 
153 32.623 28.278 52.841 1.00 17.41 

153 32.789 23.162 54.269 1.00 22.61 
153 31.534 27.648 54.916 1.00 27.31 
153 30.433 27.831 54.396 1.00 20.50 

153 33.145 29.490 54.855 1.00 19.11 
153 34.010 20.302 53.957 1.00100.00 
153 34.060 22.117 54.524 1.00100.00 
153 33.463 21.798 56.330 1.00 30.27 

154 31.733 25.933 56.052 1.00 22.29 
154 30.669 25.339 55.795 1.00 22.66 
154 29.820 27.401 57.552 1.00 29.00 
30.274 28.457 57.960 1.00 27. C2 
31.224 25.336 57.744 1.00 19. TS 
28.566 27.063 57.726 1.00 29.43 
27.669 27.887 58.484 1.00 32.13 
26.976 27.019 59.511 1.00 44.51 
25.898 26.492 59.274 1.00 39.56 
26.659 28.617 57.597 1.00 31.70 
26.140 29.851 58.247 1.00 49.89 
26.595 30.297 59.277 1.00 46.67 
25.187 30.422 57.565 1.00 76.07 
27.646 26.816 60.629 1.00 46.37 
27.116 25.954 61.654 1.00 53.23 
25.750 26.369 62.224 1.00 65.62 
25.012 25.520 62.703 1.00 65.54 
28.147 25.612 62.725 1.00 59.51 
25.398 27.655 62.138 1.00 68.32 
24.119 28.135 62.670 1.00 73.00 
22.891 27.767 61.817 1.00 87.53 
21.778 27.547 62.325 1.00 96.16 
23.095 27.725 60.506 1.00 72.49 
22.040 27.386 59.593 1.00 66.13 
22.235 25.985 59.040 1.00 58.21 
21.447 25.524 58.226 1.00 59.85 
23.303 25.294 59.502 1.00 40.00 
23.582 23.944 59.012 1.00 36.67 
23.755 24.002 57.500 1.00 34.11 
23.223 23.167 56.754 1.00 31.69 
22.431 22.952 59.367 1.00 46.42 
22.842 21.485 59.428 1.00 80.46 
23.850 21.121 60.054 1.00100.00 
22.003 20.620 58.854 1.00 58.09 
24.474 25.044 57.062 1.00 22.34 
24.686 25.247 55.663 1.00 17.58 
26.055 25.791 55.433 1.00 26.75 
26.960 25.664 56.271 1.00 25.57 
26.200 26.395 54.277 1.00 23.23 
27.442 26.975 53.909 1.00 16.45 
27.200 28.354 53.295 1.00 15.77 
26.118 28.680 52.962 1.00 15.95 
28.129 26.117 52.864 1.00 19.27 
27.237 26.016 51.619 1.00 18.53 
28.351 24.735 53.445 1.00 21.96 
28.009 25.614 50.350 1.00 14.44 
28.226 29.169 53.471 1.00 17.56 
28.187 30.508 52.948 1.00 14.42 
29.216 30.524 51.857 1.00 17.73 
30.249 29.875 51.991 1.00 19.16 
28.480 31.540 54.055 1.00 18.15 
27.221 31.963 54.796 1.00 42. C3 
27.493 32.787 56.039 1.00 70.42 
28.911 31.176 50.759 1.00 13.74 
29.798 31.201 49.629 1.00 11.95 
29.928 32.610 49.103 1.00 19.33 
28.944 33.318 48.983 1.00 19.84 
29.249 30.268 48.532 1.00 15.59 
30.105 30.277 47.261 1.00 12.29 
29.029 23.852 49.077 1.00 15.55 
31.146 32.999 48.733 1.00 14.03 
31.332 24.310 48.195 1.00 15.55 
32.396 24.271 47.050 1.00 20. OS 
23.268 23.386 46.988 1.00 22.49 
21.732 25.325 49.308 1.00 20.52 
23.196 35.697 49.330 1.00 89.21 
24.020 24.987 49.929 1.00100.00 
23.515 26.831 43.7C0 1.00 91.46 
22.244 25.207 46.109 1.00 17.27 
22. 122 25.301 44.953 1 . CO 1C55 
FIG 5-20 
FIG 5-21 
172 31.813 45.165 23.509 1.00 22.46 

172 31.122 46.531 23.786 1.00 58.53 

172 29.871 46.783 22.933 1.00100.00 

172 29.415 45.970 22.156 1.00100.00 

172 29.370 47.982 23.149 1.00100.00 

173 34.277 46.934 25.034 1.00 24.41 
173 35.292 47.978 24.852 1.00 25.03 
173 36.651 47.624 25.455 1.00 33.40 
173 37.561 48.451 25.518 1.00 30.42 
173 34.822 49.319 25.401 1.00 23.30 
173 34.743 49.358 26.912 1.00 32.47 
173 34.406 50.355 27.513 1.00 37.58 

173 34.949 48.196 27.504 1.00 49.22 

174 36.766 46.410 25.956 1.00 23.87 
174 38.019 45.994 26.537 1.00 21.30 
174 38.012 46.090 28.044 1.00 19.99 

174 38.927 45.585 28.709 1.00 20.45 

175 36.972 46.767 28.598 1.00 13.88 
175 36.898 46.931 30.034 1.00 8.70 
175 36.296 45.728 30.765 1.00 17.30 
175 36.136 44.655 30.175 1.00 18.77 
175 36.288 48.235 30.450 1.00 14.07 

175 36.360 48.316 31.865 1.00 24.79 

176 35.953 45.912 32.051 1.00 13.74 
176 35.415 44.826 32.864 1.00 16.46 
176 34.191 45.204 33.703 1.00 22.46 
176 34.159 46.254 34.334 1.00 21.31 
176 36.477 44.285 33.818 1.00 24.43 
176 35.647 43.344 34.827 1.00 27.45 

176 37.532 43.536 33.035 1.00 25.65 

177 33.234 44.269 23.787 1.00 15.47 
177 32.048 44.430 34.647 1.00 15.40 
177 32.102 43.457 35.813 1.00 10.60 
177 32.027 42.243 35.634 1.00 13.55 
177 30.709 44.283 33.872 1.00 15.57 
177 29.468 44.294 34.828 1.00 19.13 
177 29.1C3 45.673 25.361 1.00 14.91 
177 28.759 46.583 34.574 1.00 20.17 

177 29.123 45.821 26.690 1.00 17.28 

178 32.227 43.993 27.018 1.00 8.17 
178 32.313 43.180 38.181 1.00 16.66 
178 30.954 42.786 28.712 1.00 20.93 
178 30.033 43.608 28.753 1.00 14.66 
178 33.089 43.896 39.293 1.00 20.63 
178 34.286 43.110 39.815 1.00 39.28 
178 33.831 42.087 40.852 1.00 45.14 

178 35.018 42.426 38.648 1.00 39.52 

179 30.869 41.550 39.171 1.00 16.72 
179 29.652 41.033 39.754 1.00 15.55 
179 29.932 40.277 41.040 1.00 15.70 
179 30.337 39.119 41.028 1.00 15.91 

179 28.853 40.197 38.731 1.00 14.08 

180 29.694 40.946 42.155 1.00 8.88 
180 29.897 40.407 43.480 1.00 7.18 
180 28.802 39.460 43.891 1.00 17.07 
180 27.651 39.844 43.987 1.00 18.22 
180 29.934 41.509 44.509 1.00 13.06 
180 31.285 41.902 44.935 1.00 46.28 

180 31.981 41.206 45.655 1.00 60.46 

180 31.574 43.121 44.560 1.00 46.61 

181 29.173 38.242 44.197 1.00 14.51 
181 28.213 37.223 44.575 1.00 10.49 
181 28.213 36.897 46.049 1.00 14.23 
181 29.255 36.530 46.607 1.00 17.40 

181 28.450 23.915 43.769 1.00 9.89 
181 23.077 25.972 42.228 1.00 10.23 
181 23.606 26.926 41.455 1.00 12.24 



FIG 5-2: 
27.279 
28.093 
27.314 
27.029 
26.848 
25.871 
24.819 
26.359 
27.421 
27.521 
28.389 
28.532 
29.418 
29.480 
30.461 
26.246 
25.410 
25.289 
26.260 
25.984 
25.651 
26.411 
26.975 
26.361 
24.080 
23.760 
23.033 
22.219 
22.949 
23.364 
22.312 
21.159 
22.689 
23.418 
22.831 
22.421 
23.176 
23.761 
24.110 
24.704 
23.830 
21.227 
20.707 
19.976 
19.389 
19.856 
18.874 
20.753 
20.101 
19.504 
17.988 
17.390 
19.977 
20.840 
20.786 
17.382 
15.907 
15.470 
14.596 
15.385 
15.555 
13.916 
15.139 
16.142 
15.833 
16.339 
17.016 



35.146 
36.678 
35.594 
36.897 
36.518 
35.393 
35.520 
37.664 
38.693 
39.715 
38.616 
40.674 
39.559 
40.594 
41.534 
34.277 
33.104 
32.311 
32.174 
32.219 
30.688 
29.884 
30.454 
28.553 
31.739 
30.829 
29.582 
29.640 
31.444 
32.855 
33.517 
33.054 
34.625 
23.446 
27.155 
26.463 
26.402 
26.212 
26.696 
27.758 
25.868 
25.941 
25.227 
24.010 
23.991 
26.100 
26.752 
27.121 
22.951 
21.683 
21.757 
22.518 
20.682 
21.449 
22.918 
20.957 
20.855 
19.766 
19.966 
20.574 
21.775 
20.141 
21.471 
13.618 
17.531 
17.817 
13.810 



41.606 
40.269 
40.316 
46.668 
48.062 
48.089 
47.532 
48.934 
49.062 
48.120 
50.064 
48.197 
50.147 
49.216 
49.308 
48.686 
48.583 
49.863 
50.623 
47.422 
47.457 
46.389 
45.456 
46.473 
50.055 
51.168 
50.653 
49.747 
52.330 
52.768 
53.657 
53.752 
54.286 
51.207 
50.887 
52.166 
53.172 
50.119 
48.748 
48.592 
47.763 
52.139 
53.288 
52.824 
51.730 
54.206 
53.446 
54.903 
53.620 
53.269 
53.288 
54.071 
54.337 
55.338 
54.949 
52.453 
52.407 
53.389 
54.202 
50.991 
50.1C2 
50.981 



48.660 
53.352 
54.233 
55.702 
55.967 



1.00 10.42 
1.00 9.97 
1.00 9.38 
1.00 10.40 
1.00 13.86 
1.00 20.61 
1.00 16.35 
1.00 21.12 
1.00 34.16 
1.00 46.06 
1.00 38.56 
1.00 57.53 
1.00 40.76 
1.00 54.61 
1.00 61.92 
1.00 17.63 
1.00 16.37 
1.00 21.39 
1.00 19.86 
1.00 13.33 
1.00 17.38 
1.00 17.27 
1.00 13.80 
1.00 13.94 
1.00 19.74 
1.00 16.55 
1.00 13.60 
1.00 18.01 
1.00 20.11 
1.00 74.84 
1.00100.00 
1.00 97.99 
1.00100.00 
1.00 14.75 
1.00 13.86 
1.00 16.06 
1.00 17.39 
1.00 15.20 
1.00 12.75 
1.00 22.56 
1.00 17.70 
1.00 18.01 
1.00 17.40 
1.00 23.63 
1.00 24.57 
1.00 28.82 
1.00 35.65 
1.00 28.86 
1.00 22.40 
1.00 20.28 
1.00 22.41 
1.00 25.07 
1.00 19.79 
1.00 26.98 
1.00 22.04 
1.00 18.77 
1.00 20.12 
1.00 31.58 
1.00 38.58 
1.00 21.52 
1.00 16.10 
1.00 28.85 
1.00 15.31 
1.00 32.39 
1.00 32.94 
16.003 16.928 So. 617 1.00 49.41 
16.392 17.047 58.021 1.00 55.01 
17.556 16.115 58.338 1.00 56.15 
18.083 16.100 59.463 1.00 58.30 
15.195 16.734 58.955 1.00 63.89 
14.592 15.365 58.686 1.00 99.67 
14 5 Q 9 14.466 59.514 1.00100.00 
14.088 15.240 57.470 1.00100.00 
17.921 15.312 57.323 1.00 47.20 
19.015 14.347 57.419 1.00 44.96 
20.359 15.044 57.587 1.00 34.43 
20.452 16.266 57.438 1.00 29.96 
21.402 14.264 57.905 1.00 27.26 
22.737 14.834 58.100 1.00 24.01 
23.444 15.274 56.787 1.00 20.55 
23.323 14.648 55.740 1.00 23.84 
23.583 13.764 58.825 1.00 21.00 
22.739 12.501 58.915 1.00 27.49 
21.330 12.863 58.448 1.00 27.26 
24.193 16.363 56.892 1.00 17.87 
24.964 16.902 55.792 1.00 19.51 
26.380 17.108 56.249 1.00 22.37 
26.663 17.189 57.443 1.00 23.84 
24.449 18.245 55.256 1.00 25.24 
23.059 18.118 54.632 1.00 21.90 
24.497 19.322 56.346 1.00 24.81 
27.253 17.241 55.277 1.00 19.04 
28.654 17.438 55.516 1.00 20.29 
29.006 13.930 55.571 1.00 18.71 
28.907 19.615 54.591 1.00 20.13 
29.412 16.806 54.327 1.00 22.92 
29.994 15.423 54.542 1.00 30.60 
29.227 14.642 55.595 1.00 25.19 
30.048 14.672 53.211 1.00 25.61 
29.453 19.430 56.713 1.00 17.39 
29.881 20.808 56.785 1.00 18.83 
^.239 20.837 56.579 1.00 23.32 
32.161 20.152 57.281 1.00 21.93 
29.489 21.525 58.072 1.00 22.20 
28.055 21.349 58.444 1.00 26.40 
27.937 21.508 59.941 1.00 31.99 
27.225 22.395 57.726 1.00 26.90 
31.789 21.610 55.597 1.00 21.58 
33.177 21.666 55.154 1.00 22.17 
34.080 22.623 55.892 1.00 29.56 
33.635 23.588 56.490 1.00 29.04 
23.054 22.265 53.752 1.00 22.77 
31.761 23.104 53.735 1.00 18.99 
30.910 22.567 54.861 1.00 16.42 
35.379 22.410 55.716 1.00 22.95 
36.364 23.370 56.134 1.00 19.71 
36.556 24.295 54.931 1.00 24.74 
36.251 23.913 53.800 1.00 24.88 
37.711 22.730 56.446 1.00 22.28 
37.690 21.913 57.687 1.00 43.93 
36.912 22.117 58.608 1.00 53.47 
38.634 11.006 57.694 1.00 31.58 
37.062 25.501 55.168 1.00 19.74 
37.254 26.470 54.118 1.00 15.33 
37.974 25.889 52.971 1.00 19.61 
"8.958 25.236 53.134 1.00 22.69 
39.012 27.704 54.614 1.00 24.48 
, -7.225 23.504 55.632 1.00 52.21 
26.1C7 23.174 55.961 1.00 34.54 
"7.254 29.556 55.150 1.00 55.11 
:7.s62 15.125 51. SOI 1.00 16.30 
•3.071 25.627 50.616 1.00 15.30 
199 37.496 26.357 

199 36.757 27.295 

199 37.988 24.103 

199 36.597 23.628 

199 35.695 23.491 

199 35.987 23.282 

199 34.561 23.052 

199 34.716 22.905 

200 37.879 25.998 
200 37.334 26.689 
200 37.207 25.824 
200 37.793 24.751 
200 38.030 28.011 
200 39.382 27.745 
200 39.543 27.526 
200 40.473 27.605 
200 40.800 27.222 
200 41.739 27.314 
200 41.896 27.132 

200 43.153 26.820 

201 36.393 26.309 
201 36.147 25.680 
201 36.753 26.532 
201 36.619 27.753 
201 34.628 25.518 
201 33.749 25.027 
201 32.293 24.938 

201 34.196 23.635 

202 37.407 25.868 
202 38.047 26.490 
202 37.222 26.189 
202 36.919 25.038 
202 39.485 25.987 

202 40.067 26.353 

203 36.798 27.241 
203 35.879 27.067 
203 35.417 27.521 
203 37.192 23.472 
203 34.565 27.892 
203 34.911 29.260 

203 33.935 27.557 

204 35.913 26.883 
204 36.173 27.271 
204 34.956 26.980 
204 34.334 25.932 
204 37.475 26.696 
204 37.271 25.371 
204 38.588 24.722 
204 39.011 24.716 

204 39.276 24.241 

205 34.619 27.913 
205 33.447 27.762 
205 33.654 23.307 
205 34.282 29.337 
205 32.197 23.445 

205 32.121 28.406 

206 33.065 27.630 
206 33.079 23.029 
206 31.623 23.192 
206 30.809 27.306 

206 33.751 2S.S26 

207 31.335 23.320 
207 30.036 29.617 
207 30.070 2?. 445 
207 31.014 Z? .840 
207 29.530 21.057 
207 29.744 31.493 
49.450 1.00 14.85 

49.643 1.00 16.45 

50.471 1.00 16.53 

50.218 1.00 16.65 

51.244 1.00 17.85 

49.048 1.00 18.67 

50.688 1.00 19.45 

49.364 1.00 18.74 

48.247 1.00 12.56 

47.100 1.00 14.01 

45.870 1.00 15.57 

45.768 1.00 20.20 

46.779 1.00 19.79 

46.202 1.00 22.25 

44.835 1.00 22.53 

47.057 1.00 25.73 

44.317 1.00 35.51 

46.559 1.00 29.34 

45.186 1.00 54.14 

44.703 1.00 62.66 

44.946 1.00 15.07 

43.678 1.00 11.01 

42.593 1.00 17.30 

42.610 1.00 20.19 

43.354 1.00 10.09 
FIG 5-25 



ATOM 1602 CD1 LEU 207 28.955 32.790 27.707 1.00 13.73 
ATOM 1603 CD2 LEU 207 29.263 20.406 23.400 1.00 18.79 
ATOM 1604 N SER 208 29.011 28.863 23.698 1.00 15.25 
ATOM 1605 CA SER 208 28.914 28.692 22.270 1.00 13.74 
ATOM 1606 C SER 208 27.449 28.852 21.794 1.00 20.16 
ATOM 1607 O SER 208 26.548 29.085 22.594 1.00 15.81 
ATOM 1608 CB SER 208 29.495 27.367 21.822 1.00 17.82 
ATOM 1609 OG SER 208 28.769 26.311 22.431 1.00 31.45 
ATOM 1610 N LYS 209 27.242 28.738 20.485 1.00 16.50 
ATOM 1611 CA LYS 209 25.907 28.828 19.906 1.00 18.02 
ATOM 1612 C LYS 209 25.637 27.610 19.031 1.00 29.99 
ATOM 1613 O LYS 209 26.578 27.004 18.502 1.00 32.55 
ATOM 1614 CB LYS 209 25.783 30.100 19.082 1.00 20.96 
ATOM 1615 CG LYS 209 24.746 31.055 19.606 1.00 34.50 
ATOM 1616 CD LYS 209 25.262 31.964 20.666 1.00 22.72 
ATOM 1617 CE LYS 209 24.370 33.159 20.896 1.00 18.96 
ATOM 1618 NZ LYS 209 23.565 33.067 22.116 1.00 27.39 
ATOM 1619 N ASP 210 24.347 27.241 18.912 1.00 27.01 
ATOM 1620 CA ASP 210 23.890 26.159 18.038 1.00 24.62 
ATOM 1621 C ASP 210 23.465 26.793 16.705 1.00 26.77 
ATOM 1622 O ASP 210 22.468 27.514 16.605 1.00 23.00 
ATOM 1623 CB ASP 210 22.744 25.361 18.691 1.00 24.43 
ATOM 1624 CG ASP 210 22.197 24.249 17.839 1.00 35.55 
ATOM 1625 OD1 ASP 210 22.333 24.185 16.631 1.00 36.53 
ATOM 1626 OD2 ASP 210 21.499 23.400 18.535 1.00 45.51 
ATOM 1627 N PRO 211 24.306 26.618 15.708 1.00 30.25 
ATOM 1628 CA PRO 211 24.120 27.224 14.397 1.00 30.30 
ATOM 1629 C PRO 211 22.733 26.982 13.770 1.00 39.72 
ATOM 1630 O PRO 211 22.253 27.782 12.959 1.00 37.65 
ATOM 1631 CB PRO 211 25.197 26.620 13.500 1.00 29.99 
ATOM 1632 CG PRO 211 25.782 25.418 14.255 1.00 38.59 
ATOM 1633 CD PRO 211 25.158 25.405 15.647 1.00 35.05 
ATOM 1634 N ASN 212 22.102 25.868 14.140 1.00 39.64 
ATOM 1635 CA ASN 212 20.808 25.515 12.592 1.00 39.60 
ATOM 1636 C ASN 212 19.642 25.894 14.497 1.00 41.92 
ATOM 1637 O ASN 212 18.485 25.518 14.263 1.00 42.30 
ATOM 1638 CB ASN 212 20.733 24.028 13.225 1.00 48.64 
ATOM 1639 CG ASN 212 21.883 23.678 12.230 1.00 53.51 
ATOM 1640 N GLU 213 19.947 26.675 15.520 1.00 27.84 
ATOM 1641 CA GLU 213 18.953 27.080 16.473 1.00 20.43 
ATOM 1642 C GLU 213 18.485 28.527 16.241 1.00 29.95 
ATOM 1643 O GLU 213 19.247 29.475 16.324 1.00 32.77 
ATOM 1644 CB GLU 213 19.535 26.878 17.894 1.00 16.45 
ATOM 1645 CG GLU 213 18.594 27.326 13.995 1.00 13.29 
ATOM 1646 CD GLU 213 17.229 26.703 13.853 1.00 38.01 
ATOM 1647 OE1 GLU 213 16.238 27.334 18.508 1.00 25.07 
ATOM 1648 OE2 GLU 213 17.223 25.423 19.122 1.00 19.17 
ATOM 1649 N LYS 214 17.223 23.713 15.963 1.00 22.99 
ATOM 1650 CA LYS 214 16.721 30.081 15.726 1.00 22.34 
ATOM 1651 C LYS 214 16.252 30.778 16.982 1.00 21.50 
ATOM 1652 O LYS 214 16.130 22.016 17.032 1.00 28.15 
ATOM 1653 CB LYS 214 15.653 20.197 14.606 1.00 27.58 
ATOM 1654 CG LYS 214 16.153 29.816 12.209 1.00 32.71 
ATOM 1655 CD LYS 214 16.752 30.979 12.431 1.00 55.31 
ATOM 1656 N ARG 215 15.947 20.028 13.014 1.00 14.52 
ATOM 1657 CA ARG 215 15.518 20.726 19.209 1.00 15.58 
ATOM 1658 C ARG 215 16.719 21.282 19.892 1.00 21.87 
ATOM 1659 O ARG 215 17.848 21.075 19.572 1.00 26.69 
ATOM 1660 CB ARG 215 14.808 23.804 10.159 1.00 18.82 
ATOM 1661 CG ARG 215 13.660 29.067 19.475 1.00 23.20 
ATOM 1662 CD ARG 215 13.220 27.806 20.205 1.00 15.45 
ATOM 1663 NE ARG 215 14.107 25.663 19.929 1.00 23.08 
ATOM 1664 CZ ARG 215 14.022 25.473 20.543 1.00 21.38 
ATOM 1665 NH1 ARG 215 13.074 25.215 21.455 1.00 22.92 
ATOM 1666 NH2 ARG 215 14.893 14.514 20.225 1.00 20.46 
ATOM 1667 N ASP 216 16.466 22.275 10.330 1.00 16.72 
ATOM 1668 CA ASP 216 17.556 22.395 11.517 1.00 19.06 
FIG 5-26 

ATOM 1669 C ASP 216 

ATOM 1670 O ASP 216 

ATOM 1671 CB ASP 216 

ATOM 1672 CG ASP 216 

ATOM 1673 OD1 ASP 216 

ATOM 1674 OD2 ASP 216 

ATOM 1675 N HIS 217 

ATOM 1676 CA HIS 217 

ATOM 1677 C HIS 217 

ATOM 1678 O HIS 217 

ATOM 1679 CB HIS 217 

ATOM 1680 CG HIS 217 

ATOM 1681 ND1 HIS 217 

ATOM 1682 CD2 HIS 217 

ATOM 1683 CE1 HIS 217 

ATOM 1684 NE2 HIS 217 
18.047 31.817 22.607 1.00 20.02 

17.261 31.214 23.350 1.00 18.45 

17.066 34.169 22.383 1.00 21.33 

18.138 35.140 22.893 1.00 20.97 

17.869 36.079 23.620 1.00 28.46 

19.342 34.900 22.441 1.00 20.37 

19.332 31.537 22.589 1.00 13.18 

19.813 30.482 23.433 1.00 11.21 

21.313 30.614 23.723 1.00 21.35 

22.014 31.471 23.163 1.00 15.03 

19.587 29.168 22.690 1.00 13.03 

20.525 29.025 21.542 1.00 15.49 

20.463 29.871 20.449 1.00 17.88 

21.589 28.172 21.361 1.00 17.51 

21.457 29.524 19.635 1.00 17.94 

22.152 28.501 20.151 1.00 17.59 

21.794 29.725 24.576 1.00 11.26 

23.186 29.642 24.887 1.00 11.49 

23.560 28.198 25.094 1.00 24.15 

22.822 27.446 25.751 1.00 20.70 

23.539 30.421 26.172 1.00 12.84 
FIG 5-27 
34.542 22.081 36.765 1.00 r.23 

34.708 22.587 38.080 1.00 11.13 

35.324 21.553 39.010 1.00 17.52 

34.848 20.418 39.137 1.00 12.17 

33.370 23.078 38.662 1.00 16. Si 

33.622 23.736 40.022 1.00 13.90 

32.674 24.048 37.697 1.00 13.85 

36.380 21.965 39.676 1.00 11.71 

37.026 21.099 40.617 1.00 11.51 

37.366 21.798 41.927 1.00 14.76 

37.702 23.002 41.962 1.00 16.64 

38.162 20.279 40.014 1.00 20.33 

39.288 20.337 40.822 1.00 30.44 

38.468 20.722 38.631 1.00 10.89 

37.222 21.065 43.011 1.00 7.89 

37.478 21.595 44.352 1.00 11.63 

38.969 21.558 44.677 1.00 16.61 

39.687 20.699 44.199 1.00 15.60 

36.695 20.847 45.444 1.00 12.27 

39.395 22.490 45.479 1.00 13.95 

40.789 22.550 45.871 1.00 19.64 

40.987 23.299 47.170 1.00 26.23 

40.042 23.715 47.840 1.00 25.29 

41.557 23.246 44.760 1.00 18.42 

42.245 23.476 47.523 1.00 23.23 

42.616 24.292 48.653 1.00 21.61 

42.805 23.562 49.939 1.00 32.93 

42.948 24.201 51.009 1.00 22.53 

42.803 22.231 49.842 1.00 33.59 

43.006 21.375 50.998 1.00 31.81 

44.016 20.291 50.633 1.00 28.78 

45.090 20.176 51.246 1.00 96.02 

41.691 20.772 51.519 1.00 35.70 

40.890 21.807 52.325 1.00 30.66 

41.990 19.549 52.392 1.00 33.37 

39.386 21.715 52.092 1.00 28.74 

27.530 12.735 38.010 1.00 15.09 

23.919 34.589 27.331 1.00 1C.29 

27.229 34.816 35.487 1.00 11.12 

29.914 13.943 44.692 1.00 16.10 

30.956 21.886 49.900 1.00 21.47 

20.072 31.196 43.592 1.00 16.55 

26.660 48.630 33.797 1.00 24.57 

22.329 33.239 41.399 1.00 14.11 

22.465 48.025 32.810 1.00 15.51 

31.012 39.126 29.118 1.00 15.01 
FIG 5-28 
34.942 24.730 29.532 1.00 38.93 

25.235 12.919 54.611 1.00 36.20 

38.048 23.467 36.645 1.00 37.73 

12.284 43.511 38.338 1.00 33.79 

9.826 47.020 32.568 1.00 46.67 

7.671 41.532 29.806 1.00 40.88 

15.430 23.713 26.808 1.00 34.73 

24.344 20.385 25.121 1.00 53.42 

31.550 10.656 40.819 1.00 47.85 

17.569 23.030 25.796 1.00 28.17 

19.174 38.552 23.965 1.00 45.54 
24.268 37.527 25.415 1.00 30.97 
21.266 29.482 41.551 1.00 19.69 
20.668 26.999 41.933 1.00 11.81 
24.780 24.795 43.460 1.00 20.95 
42.962 13.170 46.312 1.00 31.00 
32.322 14.088 47.013 1.00 28.20 
31.708 13.186 49.679 1.00 35.57 
22.408 35.801 50.514 1.00 40.71 
25.366 47.090 42.583 1.00 38.15 
27.243 47.647 43.977 1.00 41.55 
29.868 45.076 42.906 1.00 29.32 

14.175 22.269 42.680 1.00 74.11 
13.414 10.739 35.791 1.00 29.92 
20.338 9.974 37.765 1.00 30.46 
23.520 40.420 24.953 1.00 29.75 
25.718 41.692 26.023 1.00 30.43 
26.826 38.466 25.345 1.00 31.72 
37.768 42.373 25.123 1.00 41.53 
40.078 42.268 25.852 1.00 37.12 
31.483 38.677 22.083 1.00 54.21 
33.891 37.723 30.126 1.00 23.35 
39.935 26.543 36.329 1.00 47.53 
36.631 34.210 41.636 1.00 62.74 
37.038 29.783 52.197 1.00 40.07 
37.289 37.407 40.231 1.00 37.59 
18.930 17.517 52.472 1.00 35.80 
19.506 18.914 57.913 1.00 45.72 
30.903 25.708 41.139 1.00 21.54 
30.369 25.678 24.583 1.00 22.46 
21.000 33.705 20.826 1.00 26.00 
13.648 22.794 21.329 1.00 27.98 
29.735 25.683 38.707 1.00 21.00 
33.670 24.419 60.503 1.00 50.04 
30.034 11.047 37.420 1.00 43.28 

8.662 25.846 35.068 1.00 51.94 

10.847 26.466 39.503 1.00 42.32 

14.395 48.943 39.085 1.00 29.72 

36.676 11.660 40.172 1.00 39.81 

35.968 7.212 34.763 1.00 58.66 

17.426 21.988 21.077 1.00 41.69 

29.837 22.623 39.378 1.00 32.82 

23.855 29.386 55.164 1.00 55.00 

17.408 25.360 47.495 1.00 61.61 

27.900 49.720 42.448 1.00 47.70 

13.932 26.220 44.385 1.00 45.08 

12.650 22.021 43.288 1.00 49.86 

16.974 42.357 43.435 1.00 34.38 

37.335 42.653 28.295 1.00 64.46 

29.701 49.856 35.323 1.00 62.61 

27.267 50.825 23.976 1.00 66.60 

19.661 13.131 51.537 1.00 34.01 

29.412 17.505 59.089 1.00 51.78 
The method of engineering a fluorescent protein of Group IV is different from the method of producing fluorescence resonance 
energy transfer of Group V because the resulting fluorescence is different in each case and varies in its characteristics. 
Similarly, the special technical features of each of Groups IV and V are different from those of the crystal of Group VI, 
the computation method of Group VII, the storage device of Group VIII, and the method of identifying chemicals of 
Group IX. Finally, the crystal of Group VI, the computation method of Group VII, the storage device of Group VIII, and 
the method of identifying chemicals of Group IX are clearly unrelated to each other, and there is no special technical 
feature that connects them together. 